I've come accross the following error about html5lib when trying to read an html data frame.
Here is the code:
!pip install html5lib
!pip install lxml
!pip install beautifulSoup4
import html5lib
import lxml
from bs4 import BeautifulSoup
table_list = pd.read_html("http://www.psmsl.org/data/obtaining/")
This is the error:
ImportError Traceback (most recent call last)
<ipython-input-68-e24654a0a301> in <module>()
----> 1 table_list = pd.read_html("http://www.psmsl.org/data/obtaining/")
/home/sage/sage-8.0/local/lib/python2.7/site-packages/pandas/io/html.pyc in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, tupleize_cols, thousands, encoding, decimal, converters, na_values, keep_default_na)
913 thousands=thousands, attrs=attrs, encoding=encoding,
914 decimal=decimal, converters=converters, na_values=na_values,
--> 915 keep_default_na=keep_default_na)
/home/sage/sage-8.0/local/lib/python2.7/site-packages/pandas/io/html.pyc in _parse(flavor, io, match, attrs, encoding, **kwargs)
737 retained = None
738 for flav in flavor:
--> 739 parser = _parser_dispatch(flav)
740 p = parser(io, compiled_match, attrs, encoding)
741
/home/sage/sage-8.0/local/lib/python2.7/site-packages/pandas/io/html.pyc in _parser_dispatch(flavor)
680 if flavor in ('bs4', 'html5lib'):
681 if not _HAS_HTML5LIB:
--> 682 raise ImportError("html5lib not found, please install it")
683 if not _HAS_BS4:
684 raise ImportError(
ImportError: html5lib not found, please install it
Any help would be much appreciated. Thanks
If you read the error message, you don't have html5lib
installed. Do:
pip install html5lib
in your terminal.
If you are calling from jupyter notebook (just like you did with !
), try to restart the kernel in order to have the packages loaded.