Python3 : module 'tabula' has no attribute 'read_pdf'

Sukhi · Feb 24, 2020

A .py program works, but the exact same code doesn't work when it is exposed as an API.

The code reads a PDF with Tabula and returns the table content as output.

I've tried:

import tabula
df = tabula.read_pdf("my_pdf")
print(df)

and

from tabula import wrapper
df = wrapper.read_pdf("my_pdf")
print(df)

I've installed tabula-py (not tabula) on AWS EC2 running Ubuntu.

More than read_pdf, I actually want to convert to CSV and return that output. But that doesn't work either. I get the same no-attribute error, i.e. module 'tabula' has no attribute 'convert_into'.

The .py file and the API file (.py as well) are in the same directory and are accessed with the same user.

Any help will be highly appreciated.

EDIT: I tried to run the same Python file from the API as an OS command (os.system("python3 /home/ubuntu/flaskapp/tabler.py")). But that didn't work either.

Answer

Skippy le Grand Gourou · Feb 2, 2021

There is actually an entry in the tabula-py FAQ about this specific issue:

If you’ve installed tabula, it will be conflict the namespace. You should install tabula-py after removing tabula.
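One way to check for the conflict described in the FAQ is to ask Python which distributions are installed in the current environment. This is only a minimal sketch: the helper names installed_version and find_conflicts are made up for illustration, and it assumes Python 3.8+ (for importlib.metadata).

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_conflicts(names=("tabula", "tabula-py")):
    """Map each distribution name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    found = find_conflicts()
    print(found)
    if found["tabula"] is not None:
        # Both packages provide a top-level module named `tabula`.
        print("The 'tabula' distribution is installed and may shadow tabula-py.")
```

If this reports a version for tabula, that package is a likely source of the missing attribute.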

Although using read_pdf() from tabula.io worked, as suggested by other answers, I was also able to use tabula.read_pdf() after removing tabula and reinstalling tabula-py (using pip install --force-reinstall tabula-py).
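Since the question says the script works but the API does not, it may also help to compare what each of them actually imports. The sketch below is an assumption-laden diagnostic, not part of tabula-py: describe_module is a hypothetical helper that reports the running interpreter, the file the module was loaded from, and whether the expected attribute exists.

```python
import importlib
import sys


def describe_module(module_name, attr):
    """Report where a module was loaded from and whether it exposes `attr`."""
    module = importlib.import_module(module_name)
    return {
        # Interpreter running this code (may differ between script and API).
        "interpreter": sys.executable,
        # Path of the imported module; None for modules without a file.
        "module_file": getattr(module, "__file__", None),
        # True if the module has the attribute we expect to call.
        "has_attr": hasattr(module, attr),
    }
```

Calling describe_module("tabula", "read_pdf") from both the standalone script and the API handler and comparing the results should show whether the two run under different interpreters or pick up a different tabula module.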