Reputation: 37807
I'm working with a .csv
dataset that I got from here.
I managed to write the code below to store the dataframe weather
in a sql database/table but I don't know how to create a unique ID (eg. OID1, OID2, ...) in the database/table for each row.
import sqlite3 as sql
import pandas as pd
weather = pd.read_csv('londonweather.csv', delimiter=',')
conn = sql.connect('weather.db')
weather.to_sql('weather', conn)
weather.sample(5)
Also, whenever I update the dataframe (by adding some rows), these rows have to be inserted in the database/table with a unique ID (OIDx, with x as the row index in the table).
Is it possible to achieve this ? Do you have any propositions, please ?
Upvotes: 0
Views: 599
Reputation: 37807
After some digging and thanks to the SO posts/answers below, I managed to build the right code for my question :
import sqlite3 as sql
import pandas as pd
### --- READING THE CSV AS A DATAFRAME
df = pd.read_csv('londonweather.csv', delimiter=',')
### --- CREATING THE DATABASE + CURSOR
conn = sql.connect('weather_SO.db')
cur = conn.cursor()
### --- CREATING THE TABLE
cur.execute('''CREATE TABLE IF NOT EXISTS WEATHER_TABLE
(unique_id TEXT PRIMARY KEY, Year INTEGER, Month INTEGER, Tmax REAL, Tmin REAL, Rain REAL, Sun REAL)''')
### --- COPYING THE DATAFRAME TO THE TABLE
df.to_sql('WEATHER_TABLE', conn, if_exists='append', index=False)
### --- UPDATING THE VALUE OF THE UNIQUE ID
cur.execute("UPDATE WEATHER_TABLE set unique_id = 'OID' || rowid")
### --- SAVING CHANGES
conn.commit()
### --- INSERTING NEW ROWS THE TABLE
data_weather = [(2022, 7, 20.0, 20.0, 1.0, 2.0), (2022, 8, 19.3, 20.0, 2.0, 8.0)]
cur.executemany('INSERT INTO WEATHER_TABLE(Year, Month, Tmax, Tmin, Rain, Sun) VALUES (?,?,?,?,?,?)', data_weather)
### --- UPDATING THE UNIQUE ID OF THE NEW ROWS
conn.execute("UPDATE WEATHER_TABLE set unique_id = 'OID' || rowid")
### --- SAVING CHANGES AND CLOSING CONNECTION
conn.commit()
conn.close()
Upvotes: 0
Reputation: 399
import sqlite3 as sql
import pandas as pd
weather = pd.read_csv("weather.csv", delimiter=",")
conn = sql.connect("weather.db")
weather["OID"] = list(map(lambda x: "OID" + str(int(x)), weather.index))
print(weather.head())
output:
Year Month Tmax Tmin Rain Sun OID
0 1957 1 8.7 2.7 39.5 53.0 OID0
1 1957 2 9.0 2.9 69.8 64.9 OID1
2 1957 3 13.9 5.7 25.4 96.7 OID2
3 1957 4 14.2 5.2 5.7 169.6 OID3
4 1957 5 16.2 6.5 21.3 195.0 OID4
rest of the code:
schema = pd.io.json.build_table_schema(weather, primary_key="OID")
weather.to_sql("weather", conn, schema)
print(conn.execute("SELECT * FROM weather").fetchone())
output:
(0, 1957, 1, 8.7, 2.7, 39.5, 53.0, 'OID0')
If you want to add the OID to new rows execute this line again:
weather["OID"] = list(map(lambda x: "OID" + str(int(x)), weather.index))
Just make sure you don't change the index of any of the rows
Upvotes: 1
Reputation: 1701
SQLite creates ROWID for you, unless you specifically instructed it not to do so.
Upvotes: 1