J.S.Orris

Reputation: 4821

How do I use awk to convert a Unix text file into a predefined HTML page?

I don't know HTML (HORRIBLY EMBARRASSED, but I never had the need to learn it). I am pretty perspicacious when it comes to UNIX; however, I am horribly confused by this assignment. I know what I need to do, but I am having the hardest time getting started.

I have the following files in my hwk12 directory: roster.html, roster.txt, sample.html, and sample.txt.

The following is the content of the roster.html file:

<html>
<body>
<table border=2>
<tr><th>Name</th><th>Username</th><th>Email</th></tr>
  <tr>
    <td>Nikhil Banerjee</td>
    <td>nbanerje</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Jeff Nazarian</td>
    <td>jnazaria</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Anna Melzer</td>
    <td>amelzer</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Jose Garcia</td>
    <td>jgarcia</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Jillian Testa</td>
    <td>jtesta</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Clayton Lengelzigich</td>
    <td>clengelz</td>
    <td><a href="mailto:[email protected]">clayton.lengel-[email protected]</a></td>
  </tr>
  <tr>
    <td>Ashley Bennett</td>
    <td>abennett</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Ann Frost</td>
    <td>afrost</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Timothy Whipple</td>
    <td>twhipple</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Wei Shen</td>
    <td>wshen</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Cari Mahon</td>
    <td>cmahon</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Alberto Salas</td>
    <td>asalas</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Dorothy Haskett</td>
    <td>dhaskett</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Criss Bradbury</td>
    <td>cbradbur</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Steve Ellermann</td>
    <td>sellerma</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Zewdie Bekele</td>
    <td>zbekele</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Frederic Diziere</td>
    <td>fdiziere</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Matt Bowes</td>
    <td>mbowes</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Jasen Meece</td>
    <td>jmeece</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Aaron Carpenter</td>
    <td>acarpent</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Binqin Xi</td>
    <td>bxi</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Yinting Chan</td>
    <td>ychan</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Michael Evans</td>
    <td>mevans</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Herman Beringer</td>
    <td>hberinge</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Andrew Jolley</td>
    <td>ajolley</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Michael Raby</td>
    <td>mraby</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Hajar Alaoui</td>
    <td>halaoui</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Anne Lemar</td>
    <td>alemar</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Russell Crotts</td>
    <td>rcrotts</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Dan Mazzola</td>
    <td>dmazzola</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Bill Boyton</td>
    <td>bboyton</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
</table>
</body>
</html>

The following is the content of the roster.txt file:

Whipple Timothy [email protected]
Shen    Wei     [email protected]
Mahon   Cari    [email protected]
Salas   Alberto [email protected]
Haskett Dorothy [email protected]
Bradbury        Criss   [email protected]
Ellermann       Steve   [email protected]
Bekele  Zewdie  [email protected]
Diziere Frederic        [email protected]
Bowes   Matt    [email protected]
Meece   Jasen   [email protected]
Carpenter       Aaron   [email protected]
Xi      Binqin  [email protected]
Chan    Yinting [email protected]
Evans   Michael [email protected]
Beringer        Herman  [email protected]
Jolley  Andrew  [email protected]
Raby    Michael [email protected]
Alaoui  Hajar   [email protected]
Lemar   Anne    [email protected]
Crotts  Russell [email protected]
Mazzola Dan     [email protected]
Boyton  Bill    [email protected]

The following is the content of the sample.html file:

<html>
<body>
<table border=2>
<tr><th>Name</th><th>Username</th><th>Email</th></tr>
  <tr>
    <td>Michael Raby</td>
    <td>mraby</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Hajar Alaoui</td>
    <td>halaoui</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Anne Lemar</td>
    <td>alemar</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Russell Crotts</td>
    <td>rcrotts</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Dan Mazzola</td>
    <td>dmazzola</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
  <tr>
    <td>Bill Boyton</td>
    <td>bboyton</td>
    <td><a href="mailto:[email protected]">[email protected]</a></td>
  </tr>
</table>
</body>
</html>

The following is the content of the sample.txt file:

Raby    Michael [email protected]
Alaoui  Hajar   [email protected]
Lemar   Anne    [email protected]
Crotts  Russell [email protected]
Mazzola Dan     [email protected]
Boyton  Bill    [email protected]

I'm not asking for someone to do this for me, because I LOVE UNIX and I want to learn it myself. Every time I look at this HTML code I confuse the #$$#& out of myself. I need help getting started.

The homework prompt is the following:

You are to write a nawk(1) script called ~/hwk12/mk_html.awk that converts a text file (sample.txt and roster.txt) to an html page that a web browser can read. I have given you the output in the file sample.html which is reproduced below (notice how each level of indentation is two spaces deep):

Again, I don't want someone to do this for me. I'm just confused about how the data in the text file ends up inside the HTML table when the text file itself contains no HTML code. Can someone please help me get started?

Upvotes: 1

Views: 483

Answers (1)

Otaia

Reputation: 451

Looks like you'll need to define the necessary HTML tags within your script. The meat of the HTML file will be one block like this per person:

  <tr>
    <td>$first $last</td>
    <td>$username</td>
    <td><a href="mailto:$email">$email</a></td>
  </tr>

These tags define a table row. You can parse the variables from each line of the text file with awk and use them to fill in the HTML. The rest of the HTML markup is the same for every input file, so the script can print it as static text around the rows.
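As a rough sketch of how the static markup and the per-line rows can fit together (one possible layout, not the required solution): awk runs a BEGIN block once before any input is read and an END block once after the last line, so the fixed tags can go there, while the middle rule fires once for every line of the text file. The function name html_skeleton is made up for illustration, and plain awk is used on the assumption that nawk accepts the same program.

```shell
# html_skeleton: hypothetical helper that reads roster lines on stdin
# and prints the fixed page markup around one placeholder row per line.
html_skeleton() {
  awk '
    BEGIN {
        # Runs once, before the first input line: static page header.
        print "<html>"
        print "<body>"
        print "<table border=2>"
        print "<tr><th>Name</th><th>Username</th><th>Email</th></tr>"
    }
    {
        # Runs once per input line: $1 is the last name, $2 the first name.
        print "  <tr>"
        print "    <!-- cells for " $2 " " $1 " go here -->"
        print "  </tr>"
    }
    END {
        # Runs once, after the last input line: static page footer.
        print "</table>"
        print "</body>"
        print "</html>"
    }
  '
}
```

Running `html_skeleton < sample.txt` should print the header, six placeholder rows, and the footer. In the real script, the contents of the three blocks would go directly into mk_html.awk, without the shell wrapper.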

Edit: You can do something like this inside the per-line rule to grab the first and last name and print them as a table cell:

{
    last = $1     # first column of the text file: last name
    first = $2    # second column: first name
    print "  <tr>"
    print "    <td>" first " " last "</td>"
    print "  </tr>"
}

You just need to expand that to get the email and username.
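For the remaining two cells, here is a hedged sketch. The email is simply the third column. The username is not in the text file, so it has to be derived; judging only from the rows in the question (Criss Bradbury becomes cbradbur, Clayton Lengelzigich becomes clengelz), it looks like the first initial plus the last name, lowercased and cut to 8 characters. That rule is an inference from the sample output, not something the prompt states, and the function name roster_cells is made up for illustration.

```shell
# roster_cells: hypothetical helper that prints the username and email
# cells for each "Last First email" line read from stdin.
roster_cells() {
  awk '
    {
        last  = $1
        first = $2
        email = $3
        # Assumed rule (inferred from the sample rows, not from the prompt):
        # first initial + last name, lowercased, at most 8 characters.
        username = substr(tolower(substr(first, 1, 1) last), 1, 8)
        print "    <td>" username "</td>"
        # Double quotes inside an awk string are escaped with a backslash.
        print "    <td><a href=\"mailto:" email "\">" email "</a></td>"
    }
  '
}
```

These two print statements would sit between the name cell and the closing `</tr>` in the per-line rule.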

Upvotes: 1
