C#: read text file and process it

I need a C# program that writes out:

  1. how many Eric Clapton songs the radio stations played,
  2. whether there is any Eric Clapton song that all 3 stations played,
  3. how much time in total they spent broadcasting Eric Clapton songs.

The first column contains the radio ID (1, 2 or 3), the second column is the minutes part of the song's play time, the third column is the seconds part, and the rest of the line is performer:song.

So the file looks like this:

1 5 3 Deep Purple:Bad Attitude

2 3 36 Eric Clapton:Terraplane Blues

3 2 46 Eric Clapton:Crazy Country Hop

3 3 25 Omega:Ablakok

2 4 23 Eric Clapton:Catch Me If You Can

1 3 27 Eric Clapton:Willie And The Hand Jive

3 4 33 Omega:A szamuzott

................. and 670 more lines.

So far I have this:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using System.IO;

    namespace radiplaytime
    {
        public struct Adat
        {
            public int rad;
            public int min;
            public int sec;
            public Adat(string a, string b, string c)
            {
                rad = Convert.ToInt32(a);
                min = Convert.ToInt32(b);
                sec = Convert.ToInt32(c);
            }
        }
    class Program
    {
        static void Main(string[] args)
        {

            String[] lines = File.ReadAllLines(@"...\zenek.txt");
            List<Adat> adatlista = (from adat in lines
                                        //var adatlista = from adat in lines
                                    select new Adat(adat.Split(' ')[0],
                                                    adat.Split(' ')[1],
                                                    adat.Split(' ')[2])).ToList<Adat>();

            var timesum = (from adat in adatlista
                              group adat by adat.rad into ertekek
                              select new
                              {
                                  rad = ertekek.Key,
                                  hour = (ertekek.Sum(adat => adat.min) +
                                  ertekek.Sum(adat => adat.sec) / 60) / 60,

                                  min = (ertekek.Sum(adat => adat.min) +
                                  ertekek.Sum(adat => adat.sec) / 60) % 60,

                                  sec = ertekek.Sum(adat => adat.sec) % 60,

                              }).ToArray();
            for (int i = 0; i < timesum.Length; i++)
            { 
                Console.WriteLine("{0}. radio: {1}:{2}:{3} playtime",
                    timesum[i].rad,
                    timesum[i].hour,
                    timesum[i].min,
                    timesum[i].sec);
            }
            Console.ReadKey();
        }
    }
}

Upvotes: 0

Views: 131

Answers (2)

Panagiotis Kanavos

Reputation: 131364

String splitting works only if the text is really simple and doesn't contain fixed-length fields. It also generates a lot of temporary strings, which can make your program use many times the size of the original file in RAM and hurt performance through constant allocations and garbage collection.

Riv's answer shows how to use a Regex to parse this file. It can be improved in several ways though:

var pattern=@"^(\d+)\s(\d+)\s(\d+)\s(.+)\:(.+)$";
var regex=new Regex(pattern);
var plays = from line in File.ReadLines(filePath)
            let matches=regex.Match(line)
            where matches.Success   // skip blank or malformed lines instead of failing in int.Parse
            select new Plays {
                          RadioID = int.Parse(matches.Groups[1].Value),
                          PlayTimeMinutes = int.Parse(matches.Groups[2].Value),
                          PlayTimeSeconds = int.Parse(matches.Groups[3].Value),
                          Performer = matches.Groups[4].Value,
                          Song = matches.Groups[5].Value 
                       };

  1. ReadLines returns an IEnumerable<string> instead of loading all of the lines into a buffer. This means that parsing can start immediately.
  2. By creating a single Regex instance up front, we don't have to rebuild the regex for each line.
  3. No list is needed. The query returns an IEnumerable to which other LINQ operations can be applied directly.
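
The three points above can be put together in a small self-contained sketch. The names `Play`, `PlayParser` and `ParserDemo` are made up for illustration, and the sample lines are taken from the question; in a real program you would pass `File.ReadLines(path)` to `Parse` instead of an array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Illustrative record type (hypothetical name, mirrors the columns in the file).
public class Play
{
    public int RadioID { get; set; }
    public int Minutes { get; set; }
    public int Seconds { get; set; }
    public string Performer { get; set; }
    public string Song { get; set; }
}

public static class PlayParser
{
    // Built once and reused for every line.
    private static readonly Regex LineRegex =
        new Regex(@"^(\d+)\s(\d+)\s(\d+)\s(.+)\:(.+)$");

    // Lazily parses the lines; lines that do not match (e.g. blank ones) are skipped.
    public static IEnumerable<Play> Parse(IEnumerable<string> lines)
    {
        return from line in lines
               let m = LineRegex.Match(line)
               where m.Success
               select new Play
               {
                   RadioID = int.Parse(m.Groups[1].Value),
                   Minutes = int.Parse(m.Groups[2].Value),
                   Seconds = int.Parse(m.Groups[3].Value),
                   Performer = m.Groups[4].Value,
                   Song = m.Groups[5].Value
               };
    }
}

public static class ParserDemo
{
    public static void Main()
    {
        var sample = new[]
        {
            "1 5 3 Deep Purple:Bad Attitude",
            "",
            "2 3 36 Eric Clapton:Terraplane Blues"
        };

        foreach (var p in PlayParser.Parse(sample))
        {
            Console.WriteLine("{0} | {1} | {2} | {3}:{4:00}",
                p.RadioID, p.Performer, p.Song, p.Minutes, p.Seconds);
        }
    }
}
```

Because `Parse` returns a lazy sequence, nothing is read or parsed until the result is enumerated, and further LINQ operators can be chained onto it directly.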

For example :

// Sum everything as seconds first; dividing each play's seconds by 60 would truncate to 0.
var durations = plays.GroupBy(p => p.RadioID)
                     .Select(grp => new { RadioID = grp.Key,
                                          Total = grp.Sum(p => p.PlayTimeMinutes * 60 + p.PlayTimeSeconds) })
                     .Select(t => new { t.RadioID,
                                        Hours = t.Total / 3600,
                                        Mins  = t.Total / 60 % 60,
                                        Secs  = t.Total % 60 });
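
The hour/minute/second split is easiest to get right by adding everything up as seconds first. Dividing each play's seconds by 60 with integer arithmetic would discard the remainder, so the carry from seconds into minutes would be lost. A minimal sketch of the arithmetic (`DurationDemo` and `Format` are illustrative names, not part of either answer):

```csharp
using System;

public static class DurationDemo
{
    // Splits a total number of seconds into h:mm:ss.
    public static string Format(int totalSeconds)
    {
        int hours = totalSeconds / 3600;
        int mins = totalSeconds / 60 % 60;
        int secs = totalSeconds % 60;
        return string.Format("{0}:{1:00}:{2:00}", hours, mins, secs);
    }

    public static void Main()
    {
        // Two play times from the sample data: 3:36 and 2:46.
        int total = (3 * 60 + 36) + (2 * 60 + 46);   // 382 seconds
        Console.WriteLine(Format(total));            // prints 0:06:22
    }
}
```

Adding the minutes (3 + 2) and the per-play `seconds / 60` (0 + 0) separately would give 5 minutes here instead of 6.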

A further improvement could be to give names to the groups:

var pattern=@"^(?<station>\d+)\s(?<min>\d+)\s(?<sec>\d+)\s(?<performer>.+)\:(?<song>.+)$";

...
            select new Plays {
                          RadioID = int.Parse(matches.Groups["station"].Value),
                          PlayTimeMinutes = int.Parse(matches.Groups["min"].Value),
...
                       };

You can also get rid of the Plays class and use a single, slightly more complex LINQ query :

var durations = from line in File.ReadLines(filePath)
            let matches=regex.Match(line)
            where matches.Success
            let play= new {
                          RadioID = int.Parse(matches.Groups["station"].Value),
                          Minutes = int.Parse(matches.Groups["min"].Value),
                          Seconds = int.Parse(matches.Groups["sec"].Value)
                       }
            group play by play.RadioID into grp
            // Total play time in seconds, so the carry into minutes and hours is not lost
            let total = grp.Sum(p => p.Minutes * 60 + p.Seconds)
            select new { RadioID = grp.Key,
                         Hours   = total / 3600,
                         Mins    = total / 60 % 60,
                         Secs    = total % 60
            };

In this case, no strings are generated for Performer and Song. That's another benefit of regular expressions: matches and groups are just indexes into the original string, and no string is created until .Value is read. This would reduce the RAM used in this case by roughly 75%.
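
To illustrate the point about indexes, here is a small sketch that checks the performer through the group's `Index` and `Length` properties without ever reading `.Value`. `GroupSpanDemo` and `IsPerformer` are hypothetical names, and how much memory this actually saves depends on the runtime, so treat it as an illustration rather than a measurement.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupSpanDemo
{
    private static readonly Regex LineRegex = new Regex(
        @"^(?<station>\d+)\s(?<min>\d+)\s(?<sec>\d+)\s(?<performer>.+)\:(?<song>.+)$");

    // Returns true when the performer field equals the given name,
    // comparing in place through Index/Length instead of reading .Value.
    public static bool IsPerformer(string line, string name)
    {
        Match m = LineRegex.Match(line);
        if (!m.Success) return false;

        Group g = m.Groups["performer"];
        return g.Length == name.Length
            && string.CompareOrdinal(line, g.Index, name, 0, name.Length) == 0;
    }

    public static void Main()
    {
        string line = "2 3 36 Eric Clapton:Terraplane Blues";
        Group g = LineRegex.Match(line).Groups["performer"];

        // The group only records where the performer sits inside the line.
        Console.WriteLine("performer starts at {0}, length {1}", g.Index, g.Length);
        Console.WriteLine(IsPerformer(line, "Eric Clapton"));
    }
}
```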

Once you have the results, you can iterate over them :

foreach (var duration in durations)
{ 
    Console.WriteLine("{0}. radio: {1}:{2}:{3} playtime",
        duration.RadioID,
        duration.Hours,
        duration.Mins,
        duration.Secs);
}

Upvotes: 0

Riv

Reputation: 1859

You can define a custom class to store the values of each line. Use a Regex to split each line and populate the class, then use LINQ to get the information you need.

public class Plays
{
    public int RadioID { get; set; }
    public int PlayTimeMinutes { get; set; }
    public int PlayTimeSeconds { get; set; }
    public string Performer { get; set; }
    public string Song { get; set; }
}

So you then read your file and populate the custom Plays:

String[] lines = File.ReadAllLines(@"songs.txt");
List<Plays> plays = new List<Plays>();
foreach (string line in lines)
{
    var matches = Regex.Match(line, @"^(\d+)\s(\d+)\s(\d+)\s(.+)\:(.+)$"); //this will split your line into groups
    if (matches.Success)
    {
        Plays play = new Plays();
        play.RadioID = int.Parse(matches.Groups[1].Value);
        play.PlayTimeMinutes = int.Parse(matches.Groups[2].Value);
        play.PlayTimeSeconds = int.Parse(matches.Groups[3].Value);
        play.Performer = matches.Groups[4].Value;
        play.Song = matches.Groups[5].Value;
        plays.Add(play);
    }
}

Now that you have your list of songs, you can use LINQ to get what you need:

// Total Eric Clapton songs played, counting each distinct song once
var ericClaptonSongsPlayed = plays.Where(x => x.Performer == "Eric Clapton").GroupBy(y => y.Song).Count();

// Eric Clapton songs played on all radio stations.
// Count the distinct stations per song; counting plays would also match
// a song that one station played several times.
var radioStationCount = plays.Select(x => x.RadioID).Distinct().Count();
var commonEricClaptonSong = plays.Where(x => x.Performer == "Eric Clapton").GroupBy(y => y.Song).Where(z => z.Select(p => p.RadioID).Distinct().Count() == radioStationCount);

etc.
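
For completeness, here is one way all three questions could be answered once the lines are parsed. This is only a sketch under some assumptions: `PlayRecord` and `PerformerStats` are made-up names, "how many songs" is read as distinct song titles, and the records in `Main` are a hypothetical mix of the question's sample lines chosen so that one song appears on all three stations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative record type; same fields as the Plays class, shorter names.
public class PlayRecord
{
    public int RadioID;
    public int Minutes;
    public int Seconds;
    public string Performer;
    public string Song;
}

public static class PerformerStats
{
    // 1. Number of distinct songs by the performer.
    public static int DistinctSongs(IEnumerable<PlayRecord> plays, string performer)
    {
        return plays.Where(p => p.Performer == performer)
                    .Select(p => p.Song)
                    .Distinct()
                    .Count();
    }

    // 2. Songs by the performer that every station in the data played.
    public static List<string> SongsOnAllRadios(IEnumerable<PlayRecord> plays, string performer)
    {
        var list = plays.ToList();
        int radioCount = list.Select(p => p.RadioID).Distinct().Count();
        return list.Where(p => p.Performer == performer)
                   .GroupBy(p => p.Song)
                   .Where(g => g.Select(p => p.RadioID).Distinct().Count() == radioCount)
                   .Select(g => g.Key)
                   .ToList();
    }

    // 3. Total broadcast time of the performer, in seconds.
    public static int TotalSeconds(IEnumerable<PlayRecord> plays, string performer)
    {
        return plays.Where(p => p.Performer == performer)
                    .Sum(p => p.Minutes * 60 + p.Seconds);
    }

    public static void Main()
    {
        // Hypothetical sample records, not the real file.
        var plays = new List<PlayRecord>
        {
            new PlayRecord { RadioID = 1, Minutes = 3, Seconds = 36, Performer = "Eric Clapton", Song = "Terraplane Blues" },
            new PlayRecord { RadioID = 2, Minutes = 3, Seconds = 36, Performer = "Eric Clapton", Song = "Terraplane Blues" },
            new PlayRecord { RadioID = 3, Minutes = 3, Seconds = 36, Performer = "Eric Clapton", Song = "Terraplane Blues" },
            new PlayRecord { RadioID = 3, Minutes = 2, Seconds = 46, Performer = "Eric Clapton", Song = "Crazy Country Hop" },
            new PlayRecord { RadioID = 1, Minutes = 5, Seconds = 3,  Performer = "Deep Purple",  Song = "Bad Attitude" }
        };

        int total = TotalSeconds(plays, "Eric Clapton");
        Console.WriteLine("Distinct songs: {0}", DistinctSongs(plays, "Eric Clapton"));
        Console.WriteLine("On all radios: {0}", string.Join(", ", SongsOnAllRadios(plays, "Eric Clapton")));
        Console.WriteLine("Total time: {0}:{1:00}:{2:00}", total / 3600, total / 60 % 60, total % 60);
    }
}
```

If every play should count rather than every distinct title, replace `DistinctSongs` with a plain `Count` over the filtered plays.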

Upvotes: 1
