I am using a dataset containing over 3000 images for transfer learning. This is part of the code:
import glob
import numpy as np
import os
import shutil
np.random.seed(42)
files = glob.glob('train/*')
cat_files = [fn for fn in files if 'cat' in fn]
dog_files = [fn for fn in files if 'dog' in fn]
len(cat_files), len(dog_files)
cat_train = np.random.choice(cat_files, size=1500, replace=False)
It is hard to tell exactly what's going on without some sample data from train/, but a Google search for your error message turned up this, from the source code for np.random.choice():
def choice(self, a, size=None, replace=True, p=None):
    ...
    Raises
    -------
    ValueError
        If a is an int and less than zero, if a or p are not 1-dimensional,
        if a is an array-like of size 0, if p is not a vector of
        probabilities, if a and p have different lengths, or if
        replace=False and the sample size is greater than the population
        size
    ...
    # Format and Verify input
    a = np.array(a, copy=False)
    if a.ndim == 0:
        try:
            # __index__ must return an integer by python rules.
            pop_size = operator.index(a.item())
        except TypeError:
            raise ValueError("'a' must be 1-dimensional or an integer")
        if pop_size <= 0 and np.prod(size) != 0:
            raise ValueError("'a' must be greater than 0 unless no samples are taken")
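Two of the conditions listed under Raises are easy to reproduce without any image files. The sketch below is only an illustration: try_choice is a hypothetical helper name and the file names are made up. It prints whatever message NumPy raises rather than assuming its wording.

```python
import numpy as np

def try_choice(population, size):
    """Return the ValueError message from np.random.choice, or None on success."""
    try:
        np.random.choice(population, size=size, replace=False)
        return None
    except ValueError as exc:
        return str(exc)

# Made-up inputs standing in for cat_files
empty_msg = try_choice([], size=1500)                  # empty population
small_msg = try_choice(['a.jpg', 'b.jpg'], size=1500)  # fewer items than requested
ok_msg = try_choice(['a.jpg', 'b.jpg'], size=2)        # valid request

print(empty_msg)
print(small_msg)
print(ok_msg)
```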
It appears that cat_files may be empty, not of the correct type, or (since you pass replace=False) shorter than the requested sample size of 1500. Have you verified its contents before passing it to np.random.choice()?
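One way to do that check is to validate the list before sampling, so the failure points at the real cause (for example, a glob pattern that matched nothing). This is a minimal sketch, not your actual pipeline: sample_without_replacement is a hypothetical helper, and the file names below are invented stand-ins for the result of glob.glob('train/*').

```python
import numpy as np

def sample_without_replacement(files, size, seed=42):
    """Hypothetical helper: validate the population, then sample from it."""
    files = np.asarray(files)
    if files.ndim != 1:
        raise ValueError(f"expected a 1-D list of file names, got ndim={files.ndim}")
    if files.size == 0:
        raise ValueError("no files matched; check the glob pattern and working directory")
    if size > files.size:
        raise ValueError(
            f"asked for {size} samples but only {files.size} files are available"
        )
    rng = np.random.default_rng(seed)
    return rng.choice(files, size=size, replace=False)

# Invented file names standing in for glob.glob('train/*')
files = [f"train/cat.{i}.jpg" for i in range(10)] + \
        [f"train/dog.{i}.jpg" for i in range(10)]
cat_files = [fn for fn in files if 'cat' in fn]

cat_train = sample_without_replacement(cat_files, size=5)
print(len(cat_files), len(cat_train))
```

If len(cat_files) prints 0 on your machine, the problem is in the path or the file naming, not in np.random.choice() itself.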