mks212
mks212

Reputation: 921

Import MDB file into Python (pandas) on Mac

I'm running python 3.6 on a mac. I have downloaded an mdb file but do not have Microsoft access, I'd like to import each table into python and work with it there.

I have installed mdbtools and run the following from Spyder:

import pandas as pd
import subprocess
import os

os.chdir('<directory where mdb file is>')

def show_tables(path='avroll_19.mdb'):
    tables = subprocess.check_output(["mdb-tables", path])
    return tables.decode().split()

show_tables()

I get this error: FileNotFoundError: [Errno 2] No such file or directory: 'mdb-tables': 'mdb-tables'

I have also tried this, but get the same error:

import pandas_access as mdb
for tbl in mdb.list_tables('avroll_19.mdb'):
    print(tbl)

I am using Sypder within Anaconda, I am not sure if that is an issue.

The mdb file is located here: https://www1.nyc.gov/assets/finance/downloads/tar/avroll_20.zip

I also attempted to do this using pyodbc, however, it appears that the driver needed for it is not available for mac.

Thank you for your help.

Upvotes: 2

Views: 5476

Answers (5)

Rachel
Rachel

Reputation: 119

First one was super close, you forgot to add the path to chdir and the actual file name should be where the path= is like this

import subprocess import os

os.chdir('Whole_path_goes_here_minus_file_name')

    enter code here


def show_tables(path='File_name_goes_here'):
    tables = subprocess.check_output(["mdb-tables", path])
    print(tables)

    return tables.decode().split()


show_tables()

Console:

OUTPUT: b'ADM2019 C2019_A C2019_B C2019_C C2019DEP CUSTOMCGIDS2019 DRVAL2019 DRVC2019 DRVEF122019 DRVEF2019 DRVF2019 DRVGR2019 DRVIC2019 DRVOM2019 EAP2019 EF2019 EF2019A EF2019A_DIST EF2019B EF2019C EF2019D EFFY2019 EFIA2019 F1819_F1A F1819_F2 Filenames19 FLAGS2019 GR200_19 GR2019 GR2019_L2 HD2019 IC2019 IC2019_AY IC2019_PY IC2019Mission OM2019 S2019_NH S2019_OC S2019_SIS SAL2019_IS SAL2019_NIS SFA1819_P1 SFA1819_P2 SFAV1819 Tables19 Tables19_RV valuesets19 AL2019 DRVADM2019 DRVHR2019 F1819_F3 GR2019_PELL_SSL S2019_IS sectiontable19 vartable19 \n'

Upvotes: 0

breizhmg
breizhmg

Reputation: 1241

I got pandas_access to work following this thread. The issue was that Python required Windows to run mdb-tables.exe, which it did not find. The fix was as simple as:

  • downloading mdbtools-win executables stored on this github
  • adding the folder with the executables to the Windows PATH

Then magically pandas_access.read_table started working in Python:

Before:

IPdb [13]: pandas_access.read_table('Database.accdb', "Table"])
*** FileNotFoundError: [WinError 2] The system cannot find the file specified

After:

IPdb [14]: pandas_access.read_table('Database.accdb', "Table"])
      ID      Table
0      1         A
1      2         B
2      3         S
3      4         X
4      5         Z
...

Upvotes: 0

Gord Thompson
Gord Thompson

Reputation: 123399

I have just confirmed that the following approach works with current versions of pandas, JayDeBeApi, and the UCanAccess JDBC driver. For more details on how to set up the Java/UCanAccess environment, see this answer.

import jaydebeapi
import pandas as pd

db_path = "/home/gord/UCanAccessTest.accdb"

ucanaccess_jars = [
    "/home/gord/Downloads/JDBC/UCanAccess/ucanaccess-5.0.0.jar",
    "/home/gord/Downloads/JDBC/UCanAccess/lib/commons-lang3-3.8.1.jar",
    "/home/gord/Downloads/JDBC/UCanAccess/lib/commons-logging-1.2.jar",
    "/home/gord/Downloads/JDBC/UCanAccess/lib/hsqldb-2.5.0.jar",
    "/home/gord/Downloads/JDBC/UCanAccess/lib/jackcess-3.0.1.jar",
]
classpath = ":".join(ucanaccess_jars)
cnxn = jaydebeapi.connect(
    "net.ucanaccess.jdbc.UcanaccessDriver",
    f"jdbc:ucanaccess://{db_path}",
    ["", ""],
    classpath,
)

df = pd.read_sql_query("SELECT * FROM Clients", cnxn)
print(df)
"""console output:
   ID      LastName FirstName                  DOB
0   1      Thompson      Gord  2017-04-01 07:06:27
1   2        Loblaw       Bob  1996-09-12 16:03:00
"""

Note that this works for reading from Access into pandas but will not work for writing back to Access using to_sql.

Upvotes: 1

Vignesh
Vignesh

Reputation: 1623

I am also getting same file not found error. it is becuase my access data source is older 32bit .mdb and my python is 64 bit.

its working in 32 bit machines

Upvotes: 1

mks212
mks212

Reputation: 921

I have a workaround using R rather than Python. I have referenced material from this post: https://medium.com/@wenyu.z/reading-ms-access-mdb-files-on-mac-969a176baa7a.

First, in terminal run: brew install mdbtools. Please note, this requires that homebrew is already installed.

Second, in R run this:

library(Hmisc)
data <- mdb.get('avroll_19.mdb')

Of course, substitute avroll_19.mdb with your database filename.

Upvotes: 0

Related Questions