Top "Data-cleaning" questions

Data cleaning is the process of removing or repairing errors in data used by computer programs, and of normalizing that data.

modelform: override clean method

I have two questions concerning the clean method on a modelform. Here is my example: class AddProfileForm(ModelForm): ... password = forms.…

django overriding modelform data-cleaning
How do I clean twitter data in R?

I extracted tweets from twitter using the twitteR package and saved them into a text file. I have carried out …

r twitter text-mining data-cleaning
'float' object has no attribute 'strip'

I want to clean one column of my df['emp_length'] (shown in the screenshot) but when I use …

python pandas dataframe strip data-cleaning
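
The excerpt above is truncated, so the cause is an assumption here: this error commonly appears when a text column also holds missing values, which are stored as float NaN, and a string method is called on every element. A minimal sketch in plain Python (the sample values and the helper name `clean_text` are hypothetical) that strips only real strings and passes everything else through:

```python
import math


def clean_text(value):
    """Strip whitespace from strings; return non-string values unchanged."""
    if isinstance(value, str):
        return value.strip()
    return value


# Hypothetical column values. NaN is a float, so calling .strip() on it
# directly would raise: 'float' object has no attribute 'strip'.
emp_length = [" 10+ years ", "3 years", float("nan")]
cleaned = [clean_text(v) for v in emp_length]
```

In pandas itself, the vectorized `Series.str.strip()` accessor leaves missing values as they are, which may avoid the need for a per-element guard.
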
Fill in missing pandas data with previous non-missing value, grouped by key

I am dealing with pandas DataFrames like this:

       id    x
    0   1   10
    1   1   20
    2   2  100
    3   2  200
    4   1  NaN
    5   2  NaN
    6   1  300
    7   1  NaN

I would like to replace each NaN …

python pandas nan missing-data data-cleaning
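
The logic the title describes, carrying the last non-missing value forward separately for each key, can be sketched without pandas. This is an illustration under assumptions, not the asker's or any answerer's code: the helper name `ffill_by_key` is hypothetical, and the rows reuse the `id` and `x` values from the excerpt.

```python
import math


def ffill_by_key(rows):
    """Replace missing values with the last non-missing value seen for the same key."""
    last_seen = {}
    filled = []
    for key, value in rows:
        is_missing = value is None or (isinstance(value, float) and math.isnan(value))
        if is_missing:
            # Stays missing if no earlier value exists for this key.
            value = last_seen.get(key, value)
        else:
            last_seen[key] = value
        filled.append((key, value))
    return filled


nan = float("nan")
rows = [(1, 10), (1, 20), (2, 100), (2, 200), (1, nan), (2, nan), (1, 300), (1, nan)]
result = ffill_by_key(rows)
```

With pandas, the same effect is usually obtained with a grouped forward fill such as `df.groupby('id')['x'].ffill()`.
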
Blocking '0000-00-00' from MySQL Date Fields

I have a database where old code likes to insert '0000-00-00' in Date and DateTime columns instead …

mysql datetime date data-cleaning
Filter pandas data frame for col == None

I have a data frame data_df with multiple columns, one of which is c which holds country names. How …

python pandas subset data-cleaning
How to clear / maintain a django-sentry database?

I am using django-sentry to track errors in a website. My problem is that the database has grown too big. …

django sentry data-cleaning
How to clean large malformed CSV file using Python

I'm attempting to use Python 2.7.5 to clean up a malformed CSV file. The CSV file is fairly large (over 1GB). …

python csv data-cleaning malformed
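
The excerpt does not say how the file is malformed, so the following is only a sketch under one assumption: bad rows have the wrong number of fields. It uses Python 3 rather than the 2.7.5 mentioned in the question, and reads row by row so that a file over 1GB never has to fit in memory. The function name `clean_csv` and the sample data are hypothetical.

```python
import csv
import io


def clean_csv(src, dst, expected_fields):
    """Copy rows with the expected field count from src to dst, one row at a time."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    kept = dropped = 0
    for row in reader:
        if len(row) == expected_fields:
            writer.writerow(row)
            kept += 1
        else:
            dropped += 1
    return kept, dropped


# In-memory stand-ins for real files, which would be opened with newline="".
src = io.StringIO("a,b,c\n1,2,3\n4,5\n6,7,8,9\n10,11,12\n")
dst = io.StringIO()
counts = clean_csv(src, dst, expected_fields=3)
```

Returning the kept and dropped counts makes it easy to check how much data the filter discarded before trusting the cleaned file.
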
Remove special characters from entire dataframe in R

Question: How can you use R to remove all special characters from a dataframe, quickly and efficiently? Progress: This SO …

r data-science data-cleaning
R - select only factor columns of dataframe

I am trying to select only factor columns from my data frame. Example is below: bank[,apply(bank[,names(bank)!="…

r dataframe data-science data-cleaning