I have been working with scikit-learn SVMs on a binary classification problem. I computed features for a set of audio files and wrote them to a CSV file. This is what each row in the CSV file looks like:
"13_10 The Long And Winding Road " "[-6.5633095666136669e-16,-1.56E-15,-3.21E-15,-2.20E-15,-2.52E-15,-3.04E-15,-3.39E-15,-3.47E-15,-3.07E-15,-6.02E-15,-3.00E-15,-4.77E-15,-3.05E-15,-2.13E-15,-1.57E-15,-1.87E-15,-2.05E-15,-1.76E-15,-1.38E-15,-9.89E-16,-7.89E-16,-8.99E-16,-1.09E-15,-7.26E-16,-8.68E-16,-4.68E-16,-2.82E-16,-1.99E-16,-1.75E-16,-2.18E-16,-1.43E-16,-1.56E-16,-1.91E-16,-1.21E-16,-4.82E-17,-4.39E-17,-2.89E-17,-2.05E-17,0.0]" 0
The first column holds the name of the audio file, the second column holds the feature array, and the last column is the label {0,1} for binary classification.
There are 39 float values in the array. I am using the following code to extract them from the CSV file:
import csv
import numpy as np

# Read every row of the space-delimited CSV file (Python 2.7)
with open('File.csv', 'rb') as csvfile:
    albumreader = csv.reader(csvfile, delimiter=' ')
    data = list()
    for row in albumreader:
        data.append(row[0:])
data = np.array(data)

X_train = list()
Y_train = list()
k = data.shape[0]
for i in range(k):
    # Column 1 is the feature array stored as a string, column 2 is the label
    feature = data[i][1]
    x = map(float, feature[1:-2].split(','))
    X_train.append(x)
    label = data[i][2]
    y = float(label)
    Y_train.append(y)
When I print X_train and Y_train, I get the expected values. But when I use
clf = svm.SVC(C=1.0, cache_size=200,kernel='linear', max_iter=-1)
clf.fit(X_train,Y_train)
I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 540, in runfile
    execfile(filename, namespace)
  File "SVM_test.py", line 55, in <module>
    clf.fit(X_train,Y_train)
  File "sklearn\svm\base.py", line 137, in fit
  File "sklearn\utils\validation.py", line 165, in atleast2d_or_csr
  File "sklearn\utils\validation.py", line 142, in _atleast2d_or_sparse
  File "sklearn\utils\validation.py", line 120, in array2d
  File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line 460, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
Can someone help me figure out what to do now? I am really not sure what is happening inside. As far as I can tell, the dimensions of X_train and Y_train match (X_train has 21 vectors with 39 elements each, and Y_train has 21 floats, each 0 or 1), so I don't see what caused this error.
Note: I have a feeling that something might be going wrong when I convert the NumPy array to a string and then convert the string back to a NumPy array. Thanks in advance.
Edit: X_train is very large. Here it is:
[[93812.4999999983, 73189.57452, 48892.17363, 37682.69053, 33709.51536, 20815.68443, 12476.88854, 13364.13645, 9574.010981, 5844.293383, 7910.017736, 12721.38592, 14184.99241, 6988.131481, 9407.380437, 6333.852471, 5688.156663, 7167.61338, 6911.084942, 9210.064235, 5732.338515, 3585.039683, 4433.278772, 4757.658741, 3387.832928, 2711.640327, 2680.255742, 1649.410788, 2024.333977, 997.2348795, 1102.115501, 1386.86396, 1160.477719, 883.941971, 881.2712624, 749.3620066, 885.6355941, 514.1635441, 0.0], [93411.33935126709, 90714.51224, 89773.71828, 61018.71033, 28082.94493, 10120.93228, 11106.07725, 6204.140734, 5968.528906, 4970.099848, 6967.870007, 6990.611982, 7656.630743, 6615.957476, 5573.621516, 8957.245225, 8512.408652, 6976.021692, 7774.215884, 5301.046573, 4666.784091, 2539.587812, 2953.578612, 3529.863917, 2365.101263, 2579.870258, 2890.325096, 3302.179572, 2078.005268, 1425.18236, 1297.961119, 736.4896705, 640.0635888, 819.022382, 659.9559469, 438.2773842, 359.3957991, 193.9937669, 0.0], [95528.45827960827, 79000.64725, 75540.32258, 47915.39365, 29573.63325, 13554.15721, 10101.04124, 6935.685456, 13681.96711, 7726.754596, 9413.96529, 9468.785586, 10479.23762, 10070.81121, 8893.475453, 9517.553541, 8493.077533, 8021.721412, 8568.069341, 7687.282084, 9902.16325, 5442.263263, 5575.258138, 4748.557573, 4580.647869, 3014.91771, 3958.708771, 2851.846841, 3407.31788, 1982.369432, 1937.459179, 1689.049684, 1457.579778, 1055.411047, 1048.471861, 661.6174333, 827.8371903, 414.802354], [101683.46698748806, 62367.04137, 66444.15995, 49621.45404, 31623.19485, 16585.34427, 12271.46378, 12114.5615, 6666.281052, 9335.886213, 19314.70299, 22588.00911, 14133.31813, 12723.03772, 7994.399321, 11447.449, 15457.39519, 7419.208867, 9286.751692, 6128.746537, 5617.886066, 4461.131891, 4651.73188, 5835.270092, 3876.10397, 4499.228748, 2661.999151, 1431.362029, 1378.115091, 1048.827946, 1470.297845, 1087.453644, 825.6318213, 861.5003481, 804.8519616, 397.0719915, 368.8037827, 
293.36727], [96614.66763477474, 89674.79785, 73045.22026, 55387.48162, 32450.76131, 26161.93729, 16379.95699, 13446.77762, 6178.297767, 4499.9064, 6128.624979, 4928.968691, 7139.579976, 6442.404748, 7303.917218, 9064.476552, 8246.412739, 4526.169172, 4931.980606, 4022.38625, 3193.080061, 3991.709836, 4894.262891, 4523.545798, 5013.65655, 3165.268896, 2252.272798, 1971.857637, 1543.455559, 1248.305408, 1340.303682, 1069.466847, 1062.971087, 596.4763587, 541.7390803, 481.9598053, 261.6165905, 135.050925], [77116.86410716272, 85174.88022, 48949.81474, 39272.16867, 28721.41507, 26604.82082, 17057.75385, 11417.45143, 12775.94149, 8095.318819, 8318.738856, 7768.406613, 9501.155323, 8215.579012, 5801.439936, 6997.611748, 8358.126592, 6710.072432, 7903.976639, 4770.389995, 4443.449546, 3622.278619, 3628.985312, 4025.879147, 3378.124716, 1681.144815, 1873.675902, 1813.454359, 1203.261884, 734.9896092, 612.7767898, 581.1641439, 554.9952946, 338.9208239, 329.6306536, 210.3361409, 124.684456, 95.1698974], [86000.24134707314, 54315.80346, 61723.06357, 48194.93238, 34145.18298, 18060.21908, 17759.95552, 13594.71484, 10034.81255, 6892.428679, 13609.12234, 11345.97425, 12640.27575, 13636.73634, 8353.154837, 11543.51778, 9620.892875, 5364.536625, 6645.647746, 6939.929388, 6404.367983, 4279.002491, 5473.449778, 5173.72645, 4161.012572, 3189.349797, 1868.016199, 2370.813774, 1991.805589, 1862.750613, 1535.097522, 1195.019326, 824.4997101, 836.5762868, 758.8865079, 739.0096703, 426.339462, 495.362511], [88356.8775920093, 68677.18631, 56499.17126, 41069.83582, 34004.99481, 21584.94408, 16827.63584, 10875.88263, 8838.404327, 10399.33201, 10247.97332, 11592.57345, 6888.99984, 8027.86374, 4396.353004, 4926.542018, 4160.408132, 4829.051031, 5104.507749, 4445.908694, 4113.401198, 2070.059053, 2331.063956, 3091.764189, 2708.490628, 1357.792132, 1476.379979, 1099.46743, 895.2046416, 1017.410994, 855.9326154, 807.2299975, 817.8896259, 688.1633806, 620.1147918, 404.4791452, 355.3012015, 
155.124636], [129161.3158422606, 99871.12426, 69682.53863, 42152.57846, 27722.10719, 16851.46834, 12503.65957, 15820.8482, 10208.86252, 3737.281589, 11388.29292, 9216.418551, 8412.969115, 8915.691889, 7214.795344, 6312.935476, 5691.760401, 4452.333587, 6080.803383, 3169.211512, 4640.513939, 2965.070935, 2603.678979, 3427.596811, 2650.097593, 3407.197764, 2399.210804, 1585.540133, 900.6057596, 1562.799097, 1414.458688, 1085.727804, 862.853398, 1046.809149, 1299.422095, 452.1395434, 416.0278005, 342.487369], [97676.58730158686, 85928.37013, 60031.54702, 50283.65633, 30440.49477, 23396.44028, 17693.84492, 13834.72723, 13079.6, 9484.172923, 11026.12866, 15489.77935, 14751.23748, 7719.575611, 6916.062149, 9947.922301, 9860.230801, 6685.554777, 5314.504743, 6412.026375, 5126.472976, 3994.412881, 3469.94381, 3087.75188, 2150.012155, 2510.441776, 1633.896465, 1468.22101, 1451.997957, 1594.288508, 1208.749937, 1539.411357, 846.1440547, 1015.738147, 760.9050287, 531.4752058, 352.2906744, 256.992846], [99873.48353552721, 96128.33417, 56062.95108, 48316.51261, 33803.61475, 20090.40769, 14532.69355, 16973.62408, 11745.412, 10555.56359, 12415.12332, 11311.00716, 13055.02538, 13457.43473, 11949.02017, 13726.34027, 13210.19444, 6924.913491, 7526.293551, 6489.797287, 7504.193589, 3693.345327, 3173.144967, 4589.951959, 3817.607517, 2296.577132, 4241.66248, 2298.259695, 2104.233705, 1894.800787, 1435.902299, 1237.861542, 1008.052264, 743.557111, 447.3644689, 360.231905, 263.6887002, 252.53243], [118318.40927047582, 96894.04475, 72455.95855, 53538.90521, 34270.2485, 14028.66282, 6110.994324, 10831.06944, 6500.061124, 5648.546259, 9746.722376, 11098.67455, 12414.31738, 11859.15818, 5661.36057, 6467.490449, 7160.019668, 4986.101354, 4805.715894, 4384.860917, 4818.433908, 2776.480858, 2906.711958, 4180.355966, 3029.563639, 2121.677425, 2977.055372, 1650.875378, 1328.284924, 1641.967101, 1374.844716, 1269.983055, 756.2822371, 746.9782069, 635.1025738, 901.5181204, 500.4240422, 
124.234986], [99496.1074660524, 91134.19642, 64615.65163, 51749.95315, 27017.75136, 17498.19736, 8686.464718, 6354.494714, 6279.181765, 6011.661362, 9583.683802, 16802.58819, 12848.82539, 12448.85086, 9717.906293, 6025.712047, 8968.944145, 6116.427844, 8009.500521, 5857.252734, 5994.629798, 4602.865888, 5568.279578, 3847.961198, 3664.838032, 2285.641295, 2343.300802, 1538.656643, 1595.004126, 1438.685894, 1278.233128, 1138.847548, 1387.660031, 727.3346259, 443.3437923, 399.422316, 202.3671643, 210.818774], [97897.81181619188, 81534.24658, 86124.34023, 55859.41234, 43498.35095, 16317.93548, 9240.704588, 8335.639737, 5398.77203, 2959.587234, 7638.934756, 9237.569061, 9669.92492, 6395.762472, 5297.481894, 4628.757031, 5965.00084, 5360.168945, 4918.802753, 5403.035015, 7760.124783, 4316.46269, 3586.003412, 4862.517393, 2722.334238, 1950.153709, 2308.64693, 1738.602095, 1431.956923, 1195.875585, 903.5619486, 628.8441079, 378.5951575, 279.5559759, 290.8523867, 185.8872588, 124.4224622, 102.6474251], [92929.03125177219, 66037.16827, 85713.22692, 60594.81708, 21299.03928, 9728.745394, 7164.560274, 7530.287996, 3986.197072, 4768.423334, 7965.588661, 6884.742393, 7813.113615, 6783.772795, 5068.375149, 5563.205324, 4549.089711, 4178.977925, 7176.864923, 3595.204266, 4075.654498, 3667.874878, 5018.867408, 4632.204595, 4236.022945, 2419.634542, 1965.732854, 2017.314496, 1125.444672, 1776.994722, 1380.972752, 877.5693874, 1048.039171, 698.4293241, 587.3589805, 425.3561446, 374.9688448, 242.143167], [112279.4547224921, 85626.58906, 82479.88981, 41194.72139, 18581.67331, 17171.661, 11041.06798, 7470.697485, 5647.489476, 5413.921458, 6258.45235, 7817.02576, 5690.588758, 5018.057148, 2835.675844, 4192.365122, 5264.669752, 2899.863762, 4722.075443, 4359.368543, 4475.52712, 4364.193393, 4366.760559, 4466.265791, 3581.127965, 3229.694902, 3061.592084, 2761.368431, 2924.520852, 2278.74424, 1842.130366, 1353.160812, 1061.970453, 801.4987863, 559.7692834, 542.6125554, 365.0923416, 
345.207555], [103517.74691357967, 75660.08945, 77823.62831, 44178.34395, 30627.84085, 16822.14795, 12153.82383, 10477.03604, 6737.154621, 3948.567091, 5952.492101, 6657.190597, 8458.524435, 4644.542091, 3262.595869, 6196.748153, 4725.493005, 3131.648336, 3043.832975, 2397.211069, 2221.444205, 1846.007568, 1906.256992, 2565.899774, 1879.678929, 1983.431392, 2057.925713, 1379.158985, 1161.566123, 1269.932159, 1882.60896, 2175.463202, 1945.131584, 2617.451168, 1724.479089, 934.2682688, 703.1608361, 325.546], [93568.70860926574, 94747.49539969431, 46432.77848910925, 34021.920729891295, 20420.991633759644, 11780.466421174959, 11808.677934216039, 8356.053623407755, 5251.866299007888, 1837.1346095714694, 3388.9444867864604, 4160.840941722876, 3099.1858407062873, 2498.7020047692304, 2320.2543950190047, 3123.103649596546, 2353.3994748227874, 2282.188646923959, 2461.343070326571, 1867.3943363024234, 2097.623570329987, 2009.6550578189285, 3308.2027220730074, 2765.1388333698114, 1557.1149504889588, 1365.611056602633, 1960.1916988919656, 1357.4554303292293, 1174.7058492639005, 1132.4243713198962, 845.4972050001261, 1168.2703928255426, 792.9426082625839, 670.3864757268884, 462.94718979251763, 418.287362938786, 440.0999154920886, 177.49705973335398, 0.0], [95355.29766162521, 80982.187596427, 52611.83577098383, 32094.021907352668, 17954.16900608238, 10539.432714398477, 8455.780912660928, 7826.206728864228, 6509.019127983875, 3428.593131775805, 4133.834750579424, 4897.866949416399, 4527.962919676826, 3097.9755890532115, 2644.294656259542, 3715.9623636641186, 2820.3307205895694, 2502.4555417041665, 4294.009887075389, 3305.480815069842, 3473.739729060158, 3436.008663252062, 2646.057627969427, 2915.118003316749, 2807.214040627724, 2182.2047542975124, 2307.7279832228096, 2051.914227220658, 1701.1785697138466, 1387.86622139378, 1717.4780638249865, 1444.4320186566786, 1543.3397450160378, 1008.7972827019012, 804.7763630817929, 727.0076251244793, 661.7971983605773, 328.023137248546, 0.0], 
[106950.75545607107, 83171.33894927795, 98570.84168082179, 53995.601284217235, 34045.2113137451, 32682.002511908893, 20258.01771044016, 17863.78159524713, 10999.026649078776, 7606.650910143417, 8186.182643934389, 12307.240199947704, 6014.871257290792, 4781.08981401508, 5131.609324634855, 4391.107045269739, 4364.496837469433, 3795.810404058682, 5693.929878923241, 3511.0866864164072, 4967.40355405853, 3290.291028496737, 2401.232195128987, 2787.2578565673602, 2210.985797970096, 2106.714353398232, 1799.725035771931, 2223.1076215378416, 1189.234114777526, 1003.1624544891614, 1046.7700894681655, 812.1805254193989, 750.3209854314467, 893.172975198784, 492.44092578555313, 379.87738447537436, 169.4616484512177, 100.56120686339501, 0.0]]
Y_train is just the labels for the individual feature vectors. It looks like this:
[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
Hope that helps!
I finally found the answer to my question with the help of some ideas from @larsmans and @eickenberg. The problem was that the rows of X_train did not all have the same number of elements: some rows were shorter than the others, so NumPy could not form a 2D array from them. After I added the missing value to each short row, all the 1D arrays had the same length and X_train could be converted to a 2D array. Thanks for the ideas, guys!
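For anyone hitting the same error, here is a minimal sketch of how the uneven rows could be spotted and padded before calling clf.fit. The helper names (row_length_report, find_short_rows, pad_rows) are hypothetical, the toy X_train is a stand-in for the real data, and padding with 0.0 is only an assumption about what the "additional value" should be. Whether padding is appropriate depends on why the value went missing in the first place.

```python
from collections import Counter


def row_length_report(rows):
    """Return a Counter mapping row length -> number of rows with that length."""
    return Counter(len(row) for row in rows)


def find_short_rows(rows, expected_len):
    """Return the indices of rows whose length differs from expected_len."""
    return [i for i, row in enumerate(rows) if len(row) != expected_len]


def pad_rows(rows, expected_len, fill=0.0):
    """Return copies of the rows, right-padded with `fill` up to expected_len.

    The fill value is an assumption; choose whatever is meaningful for the data.
    """
    padded = []
    for row in rows:
        if len(row) > expected_len:
            raise ValueError("row is longer than expected_len")
        padded.append(list(row) + [fill] * (expected_len - len(row)))
    return padded


# Toy stand-in for X_train: the second row is one element short.
X_train = [[1.0, 2.0, 0.0], [3.0, 4.0], [5.0, 6.0, 0.0]]

expected_len = max(len(row) for row in X_train)
bad = find_short_rows(X_train, expected_len)   # indices of the ragged rows
X_fixed = pad_rows(X_train, expected_len)      # every row now has expected_len items
```

Once every row has the same length, converting the nested list to an array should give a proper 2D shape of (number of samples, number of features), which is what the fit call expects.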