How do I pass parameters to Luigi? I have a Python file called FileFinder.py with a class named getFiles:
class getFiles(luigi.Task):
and I want to pass in a directory to this class such as:
C://Documents//fileName
and then use this parameter in my run method
def run(self):
How do I run this from the command line and pass in the parameter for use in my code? I am accustomed to running this file from the command line like this:
python FileFinder.py getFiles --local-scheduler
What do I add to my code to use a parameter, and how do I add that parameter to the command line argument?
Also, as an extension of this question, how would I use multiple arguments, or arguments of different data types such as strings or lists?
As you have already figured out, you can pass arguments to luigi via --param-name param-value on the command line. Inside your code, you have to declare these variables by instantiating the Parameter class or one of its subclasses. The subclasses tell luigi that the variable has a data type other than string. Here is an example that uses two command-line arguments, one Int and one List:
import luigi


class testClass(luigi.Task):
    # Each parameter is declared as a class attribute.
    int_var = luigi.IntParameter()
    list_var = luigi.ListParameter()

    def run(self):
        print('Integer Param + 1 = %i' % (self.int_var + 1))
        # Convert the parameter value back to a list before appending.
        list_var = list(self.list_var)
        list_var.append('new_elem')
        print('List Param with added element: ' + str(list_var))
Note that luigi actually converts ListParameter values to tuples, so if you want to do list operations on them, you have to convert them back to a list first (this is a known issue, but it doesn't look like it will be fixed soon).
You can invoke the above module from the command line like this (I saved the code as a file called "testmodule.py" and made the call from inside the same directory):
luigi --module testmodule testClass --int-var 3 --list-var '[1,2,3]' --local-scheduler
Note that for variables containing a _, the _ has to be replaced by - on the command line.
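To make the naming rule concrete, here is a small sketch in plain Python. The helper to_cli_args is hypothetical and not part of luigi; it only illustrates how attribute names and values map to the arguments used in the call above, and it assumes the values are ints, strings, or lists.

```python
import json


def to_cli_args(params):
    """Hypothetical helper (not a luigi API): build luigi-style CLI arguments."""
    args = []
    for name, value in params.items():
        # Underscores in the attribute name become hyphens in the flag.
        args.append("--" + name.replace("_", "-"))
        if isinstance(value, (list, tuple)):
            # Lists are written as JSON, the same form as '[1,2,3]' in the call above.
            args.append(json.dumps(list(value)))
        else:
            args.append(str(value))
    return args


print(to_cli_args({"int_var": 3, "list_var": [1, 2, 3]}))
# prints ['--int-var', '3', '--list-var', '[1, 2, 3]']
```

When typing the list argument into a shell by hand, it still needs quoting, as in the call above.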
The testClass call prints the following (along with many status messages):
Integer Param + 1 = 4
List Param with added element: [1, 2, 3, 'new_elem']