user556171
user556171

Reputation: 3

Convert xml (mixed nodes and attributes) to csv file using c#

I have an xml which has mix of nodes and nodes with attributes as shown below

<?xml version="1.0" encoding="utf-8"?>
<Root> 
    <StudentRequestList count="3">
    <StudentRequest>
        <StudentFirstName>Test1</StudentFirstName>
        <StudentLastName>TestLastName</StudentLastName>
        <StudentGrade>3</StudentGrade>
        <StudentHomeroomTeacher>WhiteTest1</StudentHomeroomTeacher>
        <VariableData>
            <Variable name="Email">[email protected]</Variable>
            <Variable name="Hobby1">TestHobby1</Variable>
            <Variable name="Hobby2">TestHobby2</Variable>
            <Variable name="Hobby3">TestHobby3</Variable>
            <Variable name="satscore">satscoreTest1</Variable>
            <Variable name="kprepscore">krepscore1</Variable>           
        </VariableData>
    </StudentRequest>
        <StudentRequest>
        <StudentFirstName>Test2</StudentFirstName>
        <StudentLastName>TestLastName2</StudentLastName>
        <StudentGrade>3</StudentGrade>
        <StudentHomeroomTeacher>WhiteTest1</StudentHomeroomTeacher>
        <VariableData>
            <Variable name="Email">[email protected]</Variable>
            <Variable name="Hobby5">TestHobby5</Variable>
            <Variable name="Hobby6">TestHobby6</Variable>
            <Variable name="Hobby3">TestHobby3</Variable>
            <Variable name="satscore">satscoreTest2</Variable>
            <Variable name="kprepscore">krepscore2</Variable>           
        </VariableData>
    </StudentRequest>
        <StudentRequest>
        <StudentFirstName>Test1</StudentFirstName>
        <StudentLastName>TestLastName</StudentLastName>
        <StudentGrade>3</StudentGrade>
        <StudentHomeroomTeacher>WhiteTest1</StudentHomeroomTeacher>
        <VariableData>
            <Variable name="Email">[email protected]</Variable>
            <Variable name="Hobby4">TestHobby4</Variable>
            <Variable name="Hobby2">TestHobby2</Variable>
            <Variable name="Hobby3">TestHobby3</Variable>
            <Variable name="satscore">satscoreTest3</Variable>
            <Variable name="kprepscore">krepscore3</Variable>           
        </VariableData>
    </StudentRequest>
    </StudentRequestList>
    </Root>



CSV is supposed to look like below

StudentFirstName,StudentLastName,StudentGrade,StudentHomeroomTeacher,Email, Hobby1, Hobby2,Hobby3, Hobby4, Hobby5, Hobby6, satscore, kprepscore     
Test1,TestLastName,3,WhiteTest1,[email protected],TestHobby1,TestHobby2,TestHobby3,,,,satscoreTest1,krepscore1 
Test2,TestLastName2,3,WhiteTest1,[email protected],,,Hobby3,,Hobby5,Hobby6,satscoreTest2,krepscore2
Test1,TestLastName,3,WhiteTest1,[email protected],,Hobby2,Hobby3,Hobby4,,,satscoreTest3,krepscore3

I am able to extract the values for nodes without attributes (as shown in code below) but for other nodes with attributes (variabledata/variable) struggling to get values so that i can create the csv file as shown above. Any suggestions are appreciated. Thanks.

  XDocument custOrd = XDocument.Load(FILENAME);

                string csv =
                                    (from el in custOrd.Element("Root").Element("StudentRequestList").Elements("StudentRequest")
                                     select
                             String.Format("{0},{1},{2},{3},{4}{5}",
                                 (string)el.Element("StudentFirstName"),
                                 (string)el.Element("StudentLastName"),
                                 (string)el.Element("StudentGrade"),
                                 (string)el.Element("StudentHomeroomTeacher"),
                                el.Element("VariableData").Element("Variable").Attribute("name").Value.Equals("Email") ? (string)el.Element("VariableData").Element("Variable") : null,                                
                                 Environment.NewLine
                             )
                                    )
                    .Aggregate(
                        new StringBuilder(),
                        (sb, s) => sb.Append(s),
                        sb => sb.ToString()
                    );
                Console.WriteLine(csv);
                Console.ReadLine();

Upvotes: 0

Views: 421

Answers (1)

Oguz Ozgul
Oguz Ozgul

Reputation: 7187

Here is a small method that can wrap the attribute reading logic

private static string ValueOf(XElement el, string name)
{
    return (string)el.Element("VariableData")
        .Elements("Variable")
        .FirstOrDefault(x => x.Attribute("name").Value.Equals(name));
}

By employing this method in our select statement for each <Variable> element with different names, we can read all the variables that exist and return "" for the ones that don't.

I am also re-writing your string format template because it has insufficient number of parameters for what you want to write:

Now your select statement should look like this:

from el in custOrd.Element("Root").Element("StudentRequestList").Elements("StudentRequest")
select
    String.Format("{0},{1},{2},{3},{4},{5},{6},{7},{8},{9},{10},{11},{12}",
        (string)el.Element("StudentFirstName"),
        (string)el.Element("StudentLastName"),
        (string)el.Element("StudentGrade"),
        (string)el.Element("StudentHomeroomTeacher"),
        ValueOf(el, "Email"), ValueOf(el, "Hobby1"), ValueOf(el, "Hobby2"), ValueOf(el, "Hobby3"),
        ValueOf(el, "Hobby4"), ValueOf(el, "Hobby5"), ValueOf(el, "Hobby6"),
        ValueOf(el, "satscore"),
        ValueOf(el, "kprepscore"),
        Environment.NewLine

Upvotes: 0

Related Questions