PythonMan

Reputation: 897

How to export an array to CSV or TXT in Python

I'm trying to export an array to a txt or csv file. I've been trying with numpy, but I always get an error like: TypeError: Mismatch between array dtype ('<U14') and format specifier ('%.18e')

Here is my code without numpy. It works fine, but I need help with the part that exports the result.

peoples = []
for content in driver.find_elements_by_class_name('x234'):
    people = content.find_element_by_xpath('.//div[@class="zstrim"]').text
    if people != "Django" and people != "Rooky":
        peoples.append([people, 1, datetime.now().strftime("%d/%m/%y %H:%M")])
print(peoples)

Really need some help with this.

Upvotes: 0

Views: 3700

Answers (2)

hpaulj

Reputation: 231325

Looks like you are doing something like:

In [1339]: peoples=[]

In [1340]: for _ in range(3):
   ......:     peoples.append([234, datetime.datetime.now().strftime("%d/%m/%y %H:%M")])
   ......:     

In [1341]: peoples
Out[1341]: [[234, '22/06/16 14:57'], [234, '22/06/16 14:57'], [234, '22/06/16 14:57']]

peoples is a list of lists (which np.savetxt will convert to an array) that contains, among other things, formatted date strings.

In [1342]: np.savetxt('test.txt',peoples)
...    
TypeError: Mismatch between array dtype ('<U14') and format specifier ('%.18e %.18e')

Since I didn't specify fmt, savetxt constructed a default one consisting of two %.18e fields. That's great for general formatting of numbers, but the data includes 14-character strings ('U14' is a unicode string dtype in Python 3).

If I tell it to use %s, the generic string format, I get:

In [1346]: np.savetxt('test.txt',peoples, fmt='%s', delimiter=',')

In [1347]: cat test.txt
234,22/06/16 14:57
234,22/06/16 14:57
234,22/06/16 14:57

Not ideal, but it works. fmt='%20s' would be better, since it pads each field to a fixed width of 20 characters.

I glossed over another nuance. peoples is a list of lists. np.savetxt works with arrays, so it first turns the list into an array with:

In [1360]: np.array(peoples)
Out[1360]: 
array([['234', '22/06/16 14:57'],
       ['234', '22/06/16 14:57'],
       ['234', '22/06/16 14:57']], 
      dtype='<U14')

But this turns both columns into U14 strings, so I have to format both columns with %s; I can't use a numeric format on the first. To format the first column as a number, I would first need to make a structured array with numeric field(s) and a string field. I know how to do that, but I won't get into the details now.
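For readers who want to see the structured-array route, here is a minimal sketch. It is not taken from the answer: the field names 'count' and 'stamp', the sample values, and the use of an in-memory buffer instead of a file are all assumptions made for illustration.

```python
import io
import numpy as np

# Sample rows shaped like the `peoples` list above (values are made up).
peoples = [[234, '22/06/16 14:57'], [235, '22/06/16 14:58']]

# A structured dtype with one integer field and one 14-character string field.
# The field names 'count' and 'stamp' are hypothetical.
dt = np.dtype([('count', 'i8'), ('stamp', 'U14')])

# Structured arrays are built from a list of tuples, not a list of lists.
arr = np.array([tuple(row) for row in peoples], dtype=dt)

# With one format specifier per field, the first column can be numeric.
buf = io.StringIO()  # stand-in for a real file such as 'test.txt'
np.savetxt(buf, arr, fmt='%d,%s')
output = buf.getvalue()
print(output)
```

Passing a filename such as 'test.txt' in place of buf should write the same lines to disk.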

As per the comments, it could be simpler to format each peoples line as a complete string and write that to a file.

In [1378]: with open('test.txt','w') as f:
    for _ in range(3):
        f.write('%10d,%20s\n'%(234, datetime.datetime.now().strftime("%d/%m/%y %H:%M")))
   ......:         

In [1379]: cat test.txt
       234,      22/06/16 15:18
       234,      22/06/16 15:18
       234,      22/06/16 15:18
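The same idea applied to a list that has already been built, like the one in the question, might look like the sketch below. The names and timestamps are made-up sample data, and the temporary directory is only there so the example does not leave a file behind.

```python
import os
import tempfile

# Sample rows in the same [name, count, timestamp] shape as the question's list.
peoples = [['Alice', 1, '22/06/16 15:18'], ['Bob', 1, '22/06/16 15:18']]

# Write into a temporary directory; in practice this would just be 'test.txt'.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')

with open(path, 'w') as f:
    for name, count, stamp in peoples:
        # One complete, comma-separated line per row.
        f.write('%s,%d,%s\n' % (name, count, stamp))

# Read the file back to check what was written.
with open(path) as f:
    written = f.read()
print(written)
```

Note that this does no quoting, so a name containing a comma would produce an ambiguous line.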

Upvotes: 1

Padraic Cunningham

Reputation: 180391

hpaulj's answer explains why your code raises the error, but using the csv lib and writing as you go is probably a lot easier:

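import csv
from datetime import datetime

# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("out.csv", "w", newline="") as f:
    wr = csv.writer(f)
    for content in driver.find_elements_by_class_name('x234'):
        people = content.find_element_by_xpath('.//div[@class="zstrim"]').text
        if people != "Django" and people != "Rooky":
            # write each row as soon as it is scraped
            wr.writerow([people, 1, datetime.now().strftime("%d/%m/%y %H:%M")])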
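If the list has already been collected, as in the question, csv.writer can also write it in one call with writerows. The sketch below assumes made-up sample rows and writes to an in-memory buffer so the result can be inspected; a real file opened with newline="" would work the same way.

```python
import csv
import io

# Sample rows in the question's [name, count, timestamp] shape (made-up values).
# The second name contains a comma to show the quoting the csv module applies.
peoples = [['Alice', 1, '22/06/16 15:18'], ['Smith, Bob', 1, '22/06/16 15:19']]

buf = io.StringIO(newline='')  # stand-in for open("out.csv", "w", newline="")
wr = csv.writer(buf)
wr.writerows(peoples)  # one CSV line per inner list
csv_text = buf.getvalue()

# Reading it back returns every field as a string.
rows = list(csv.reader(io.StringIO(csv_text, newline='')))
print(rows)
```

Unlike hand-formatted strings, the csv module quotes fields that contain the delimiter, so the rows survive a round trip.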

Upvotes: 1
