I am on Windows 10 with Anaconda installed, but I want to build an executable independently, in a new, clean, minimal environment using Python 3.5. So I ran some tests:
TEST 1: I created a Python script test1.py in the folder testenv containing only:
print('Hello World')
Then I created the environment, installed PyInstaller and built the executable:
D:\testenv> python -m venv venv_test
...
D:\testenv\venv_test\Scripts>activate.bat
...
(venv_test) D:\testenv>pip install pyinstaller
(venv_test) D:\testenv>pyinstaller --clean -F test1.py
This creates test1.exe, which is about 6 MB.
TEST 2: I modified test1.py as follows:
import pandas as pd
print('Hello World')
I installed pandas in the environment and created the new executable:
(venv_test) D:\testenv>pip install pandas
(venv_test) D:\testenv>pyinstaller --clean -F test1.py
And it creates test1.exe, which is now 230 MB!
If I run the following command, the version string still mentions Anaconda:
(venv_test) D:\testenv>python -V
Python 3.5.2 :: Anaconda custom (64-bit)
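The version string above suggests that the venv was created from the Anaconda interpreter. A small sketch like the one below can be run inside the activated environment to see which base installation it points to. The function name describe_interpreter and the conda-meta folder check are my own illustrative assumptions; the check is a heuristic, not an official test.

```python
import os
import sys


def describe_interpreter():
    """Collect a few facts about the interpreter that is currently running."""
    # sys.base_prefix differs from sys.prefix when running inside a venv.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return {
        "executable": sys.executable,
        "version": sys.version,
        "prefix": sys.prefix,
        "base_prefix": base_prefix,
        "in_venv": sys.prefix != base_prefix,
        # Heuristic (assumption): Anaconda builds usually mention it in the
        # version string, and conda installations contain a conda-meta folder.
        "looks_like_anaconda": (
            "Anaconda" in sys.version
            or "Continuum" in sys.version
            or os.path.isdir(os.path.join(base_prefix, "conda-meta"))
        ),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print("{}: {}".format(key, value))
```

If base_prefix points into the Anaconda folder, the venv is only a thin layer on top of Anaconda, which would be consistent with the site-packages path shown in the PyInstaller log.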
When I run PyInstaller I get some messages I do not understand, for example:
INFO: site: retargeting to fake-dir 'c:\\users\\username\\appdata\\local\\continuum\\anaconda3\\lib\\site-packages\\PyInstaller\\fake-modules'
I also get messages about matplotlib and other modules that have nothing to do with my code, for example:
INFO: Matplotlib backend "pdf": added
INFO: Matplotlib backend "pgf": added
INFO: Matplotlib backend "ps": added
INFO: Matplotlib backend "svg": added
I know there are some related questions (Reducing size of pyinstaller exe; size of executable using pyinstaller and numpy), but I could not solve the problem, and I am afraid I am doing something wrong with respect to Anaconda.
So my questions are: what am I doing wrong? Can I reduce the size of my executable?
I accepted the answer above, but I am posting here what I did step by step, for complete beginners like me who easily get lost.
Before I begin, here is my complete test1.py example script with all the modules I actually need. My apologies if it is a bit more complex than the original question, but maybe this can help someone.
test1.py looks like this:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.image as image
import numpy as np
import os.path
import pandas as pd
import re
from matplotlib.ticker import AutoMinorLocator
from netCDF4 import Dataset
from time import time
from scipy.spatial import distance
from simpledbf import Dbf5
from sys import argv
print('Hello World')
I added matplotlib.use('Agg') because my actual code creates figures (see: Generating a PNG with matplotlib when DISPLAY is undefined).
I downloaded Python from:
https://www.python.org/downloads/
I installed it, selecting 'add python to path' and deselecting 'install launcher for all users' (I don't have admin rights).
I checked from CMD that I am using the version I just installed, simply by typing python.
I get:
Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Then I created and activated a new virtual environment:
D:\> mkdir py36envtest
...
D:\py36envtest>python -m venv venv_py36
...
D:\py36envtest\venv_py36\Scripts>activate.bat
I installed the modules in the activated environment, making sure they are compatible with the Python version, by using this command (from: Matplotlib not recognized as a module when importing in Python):
(venv_py36) D:\py36envtest> python -m pip install nameofmodule
NB: in my case I also had to add the option --proxy https://00.000.000.00:0000
For this example I used the development version of PyInstaller:
(venv_py36) D:\py36envtest> python -m pip install https://github.com/pyinstaller/pyinstaller/archive/develop.tar.gz
and the modules: pandas, matplotlib, simpledbf, scipy, netCDF4. At the end my environment looks like this:
(venv_py36) D:\py36envtest> pip freeze
altgraph==0.15
cycler==0.10.0
future==0.16.0
macholib==1.9
matplotlib==2.1.2
netCDF4==1.3.1
numpy==1.14.0
pandas==0.22.0
pefile==2017.11.5
PyInstaller==3.4.dev0+5f9190544
pyparsing==2.2.0
pypiwin32==220
python-dateutil==2.6.1
pytz==2017.3
scipy==1.0.0
simpledbf==0.2.6
six==1.11.0
style==1.1.0
update==0.0.1
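Before running PyInstaller it can help to confirm that every third-party module imported by test1.py can be found from the new environment. This is a minimal sketch: the helper name missing_modules is hypothetical, and the module list is simply taken from the imports in test1.py. Note that it only locates the modules; it does not import them, so it would not catch a "DLL load failed" error.

```python
import importlib.util

# Top-level third-party packages imported by test1.py.
REQUIRED = ["matplotlib", "numpy", "pandas", "netCDF4", "scipy", "simpledbf"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not installed in this environment: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```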
Initially I got a lot of "ImportError: DLL load failed" errors (especially for scipy) and missing-module errors, which I solved thanks to these posts:
What is the recommended way to persist (pickle) custom sklearn pipelines?
and the comment to this answer:
Pyinstaller with scipy.signal ImportError: DLL load failed
My inputtest1.spec finally looks like this:
# -*- mode: python -*-

# Note: 'options' is defined but never passed to EXE below,
# so it has no effect as written.
options = [('v', None, 'OPTION')]
block_cipher = None

a = Analysis(['test1.py'],
             # The scipy extra-dll folder is added so its DLLs can be found.
             pathex=['D:\\py36envtest',
                     'D:\\py36envtest\\venv_py36\\Lib\\site-packages\\scipy\\extra-dll'],
             binaries=[],
             datas=[],
             # Modules PyInstaller did not detect on its own.
             hiddenimports=['scipy._lib.messagestream',
                            'pandas._libs.tslibs.timedeltas'],
             hookspath=[],
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher)
pyz = PYZ(a.pure, a.zipped_data,
          cipher=block_cipher)
# Binaries, zipfiles and datas are passed straight to EXE (one-file build).
exe = EXE(pyz,
          a.scripts,
          a.binaries,
          a.zipfiles,
          a.datas,
          name='test1',
          debug=False,
          strip=False,
          upx=True,
          runtime_tmpdir=None,
          console=True)
Then I built the executable from the spec file:
(venv_py36) D:\py36envtest>pyinstaller -F --clean inputtest1.spec
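As far as I understand the PyInstaller documentation, when a .spec file is given the spec itself decides the bundle layout, so options such as -F are not what produce the single file here; the one-file layout comes from passing a.binaries, a.zipfiles and a.datas directly to EXE. The same build can also be started from Python through PyInstaller.__main__.run. The sketch below is only an illustration: the helper names build_args and run_build are hypothetical, and I assume the spec file sits in the current folder.

```python
def build_args(spec_path, clean=True):
    """Build the argument list that would be handed to PyInstaller."""
    args = []
    if clean:
        # Same effect as --clean on the command line: clear cached build files.
        args.append("--clean")
    args.append(spec_path)
    return args


def run_build(spec_path):
    """Run PyInstaller on the given spec file (requires PyInstaller installed)."""
    # Imported lazily so that build_args can be used without PyInstaller.
    import PyInstaller.__main__

    PyInstaller.__main__.run(build_args(spec_path))


if __name__ == "__main__":
    # Only show the arguments here; call run_build("inputtest1.spec") to build.
    print(build_args("inputtest1.spec"))
```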
My test1.exe is 47.6 MB; the .exe of the same script built from an Anaconda virtual environment is 229 MB.
I am happy (and further suggestions are welcome).
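To compare builds without reading sizes off the file explorer, a short sketch like this can be used. The helper names size_mb and shrink_factor are hypothetical; the two numbers in the example are the sizes reported above, which work out to a reduction of roughly 4.8 times.

```python
import os


def size_mb(path):
    """Return the size of a file in megabytes (1 MB = 1024 * 1024 bytes)."""
    return os.path.getsize(path) / (1024 * 1024)


def shrink_factor(before_mb, after_mb):
    """Return how many times smaller the new build is than the old one."""
    return before_mb / after_mb


if __name__ == "__main__":
    # Sizes taken from the two builds described in this post.
    print("about {:.1f}x smaller".format(shrink_factor(229, 47.6)))
```

With a real build, size_mb can be pointed at the generated test1.exe in the dist folder.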