To import an XML file into Excel, one option is to convert the XML file to CSV format and open the result in Microsoft Office Excel.
As a first step, we open the XML source file and validate it. During validation, we check the data structure of the XML for errors. If the XML file structure contains too many errors, we cannot process the file.
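The validation step can be illustrated with a minimal well-formedness check. This is only a sketch using Python's standard `xml.etree.ElementTree` parser, not the converter's actual validation logic; the helper name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for structural errors such as mismatched or unclosed tags
        return False

print(is_well_formed("<week><day>Mon</day></week>"))  # True
print(is_well_formed("<week><day>Mon</week>"))        # False: <day> is never closed
```

A check like this only reports whether the file can be parsed at all; it does not count or repair individual errors.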
In the parsing step, we import the XML data file, read and interpret its data structure, and extract the data from the XML.
Once the data has been extracted from the source XML file, the next step is to transform it from the XML-based representation into a table representation such as the CSV format.
The XML to CSV converter uses the following rules to transform the data:
- Every XML tag represents a separate table column
- Every XML attribute represents a separate table column
- Data is combined based on the upper-level XML tags
Once the data is transformed into the table representation and combined according to the XML to CSV converter rules, it is saved to a CSV file.
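The parse, transform and save steps above can be sketched as follows. This is an illustration under stated assumptions, not the converter's real implementation: each direct child of the root element is assumed to be one table row, column names are built from the tag path, and attribute columns are written as `tag@attribute`. The function names `flatten` and `xml_to_csv` are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

def flatten(element, prefix=""):
    """Map one record element to {column_name: value} (hypothetical helper)."""
    row = {}
    name = prefix + element.tag
    # Rule: every attribute becomes its own column
    for attr, value in element.attrib.items():
        row[f"{name}@{attr}"] = value
    # Rule: every tag with text becomes its own column
    text = (element.text or "").strip()
    if text:
        row[name] = text
    for child in element:
        row.update(flatten(child, prefix=name + "/"))
    return row

def xml_to_csv(xml_text: str) -> str:
    root = ET.fromstring(xml_text)
    # Assumption: each upper-level element under the root is one row
    rows = [flatten(record) for record in root]
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

sample = """<orders>
  <order id="1"><item>Pen</item><qty>2</qty></order>
  <order id="2"><item>Ink</item><qty>5</qty></order>
</orders>"""
print(xml_to_csv(sample))
# order@id,order/item,order/qty
# 1,Pen,2
# 2,Ink,5
```

In this sketch, a tag that repeats inside the same record overwrites the earlier value; a real converter would need a rule for such cases.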
XML stands for "Extensible Markup Language".
XML is both a machine-readable and a human-readable format, and it can be edited in any text editor.
XML tags must have correctly formed names that start with a letter or an underscore, not a digit.
Not valid: <1stWeekData>
Special characters inside XML such as <, >, &, ' and " should be escaped as follows:
- &lt; represents "<"
- &gt; represents ">"
- &amp; represents "&"
- &apos; represents "'"
- &quot; represents '"'
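As an illustration, the five escapes listed above can be applied with Python's standard `xml.sax.saxutils` module. The wrapper names `escape_xml` and `unescape_xml` and the `QUOTES` table are hypothetical; `escape` handles &, < and > by default, and the quote characters are passed in as extra entities.

```python
from xml.sax.saxutils import escape, unescape

# Extra replacements for the two quote characters (hypothetical helper table)
QUOTES = {"'": "&apos;", '"': "&quot;"}

def escape_xml(text: str) -> str:
    """Escape &, <, > plus both quote characters."""
    return escape(text, QUOTES)

def unescape_xml(text: str) -> str:
    """Reverse escape_xml."""
    return unescape(text, {entity: char for char, entity in QUOTES.items()})

raw = 'Tom & "Jerry" <3'
encoded = escape_xml(raw)
print(encoded)                # Tom &amp; &quot;Jerry&quot; &lt;3
print(unescape_xml(encoded))  # Tom & "Jerry" <3
```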
The XML encoding is declared on the first line of the XML file:
<?xml version="1.0" encoding="UTF-8"?>
We support all encoding formats. The most popular encoding is "UTF-8".
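To show how the declaration is used, here is a small sketch with Python's standard `xml.etree.ElementTree`: when the parser receives raw bytes, it reads the encoding declaration and decodes the content accordingly. The sample document and the ISO-8859-1 encoding are illustrative assumptions only.

```python
import xml.etree.ElementTree as ET

# A document stored as ISO-8859-1 bytes, with a matching declaration
data = '<?xml version="1.0" encoding="ISO-8859-1"?><name>José</name>'.encode("iso-8859-1")

# The parser reads the declaration and decodes the bytes itself
root = ET.fromstring(data)
print(root.text)  # José
```

If the declared encoding does not match the bytes actually stored in the file, parsing may fail or produce garbled text.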
CSV stands for Comma-Separated Values.
A CSV file is a plain text file and can be edited in any text editor.
CSV is the easiest way to represent table data, and it is both a human-readable and a machine-readable format.
A file in CSV format has the extension .csv, .tsv, or .txt, depending on the program that created it.
A CSV file uses a delimiter to separate the values. The most commonly used delimiter characters are:
- Comma "," (sample data: column1,column2,column3), used by default
- Tab character (sample data: column1	column2	column3); this file format is also called Tab-Separated Values
- Vertical bar "|" (sample data: column1|column2|column3)
- Semicolon ";" (sample data: column1;column2;column3)
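As a brief illustration, the same row can be read with any of the delimiters listed above by passing the `delimiter` argument to Python's standard `csv.reader`. The helper name `read_rows` and the `samples` table are hypothetical.

```python
import csv
import io

def read_rows(text: str, delimiter: str = ","):
    """Parse delimited text into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# One sample line per delimiter from the list above
samples = {
    ",": "column1,column2,column3",
    "\t": "column1\tcolumn2\tcolumn3",
    "|": "column1|column2|column3",
    ";": "column1;column2;column3",
}

for delimiter, line in samples.items():
    print(repr(delimiter), read_rows(line, delimiter))
```

Each sample line parses to the same three-column row, which is why the delimiter has to be known (or detected) before a file is opened.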