I am new to Python and to NetCDF, so apologies if I'm unclear. I have an nc file with several variables, and I need to extract data from such files in a different order.
My nc file has 8 variables (longitude, latitude, time, u10, v10, swh, mwd, mwp). The logic I'm trying to implement is: if I input a longitude and latitude, my program outputs the other variables (u10, v10, swh, mwd, mwp) ordered by time. I would then put the extracted data into another database.
I inspected my nc file as follows:
import netCDF4
from netCDF4 import Dataset
jan = Dataset('2016_01.nc')
# List the variable names stored in the file
print(jan.variables.keys())
lon = jan.variables['longitude']
lat = jan.variables['latitude']
time = jan.variables['time']
# Show the name and size of each dimension
for d in jan.dimensions.items():
    print(d)
# Read the coordinate values into arrays
lon_array = lon[:]
lat_array = lat[:]
time_array = time[:]
print(lon_array)
print(lat_array)
print(time_array)
Part of the output is below:
[u'longitude', u'latitude', u'time', u'u10', u'v10', u'swh', u'mwd', u'mwp']
(u'longitude', <type 'netCDF4._netCDF4.Dimension'>: name = 'longitude', size = 1440)
(u'latitude', <type 'netCDF4._netCDF4.Dimension'>: name = 'latitude', size = 721)
(u'time', <type 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'time', size = 186)
Any advice would be appreciated. Thank you.
You first need to know the order of the dimensions in the time- and space-varying variables such as u10, which you can obtain with:
u10 = jan.variables['u10']
print(u10.dimensions)
Next, it is a matter of slicing/indexing the array correctly. If you want data for, let's say, latitude = 30, longitude = 10, the corresponding (closest) indexes can be found with (after importing NumPy as import numpy as np):
i = np.abs(lon_array - 10).argmin()
j = np.abs(lat_array - 30).argmin()
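As a minimal sketch of the lookup above, the helper below also returns the grid value that was actually selected, so you can check how far it is from the coordinate you asked for. Note that nearest_index is a hypothetical name (not part of NumPy or netCDF4), and the coordinate arrays are made up to mimic a 0.25 degree grid with the same sizes as in the question.

```python
import numpy as np

def nearest_index(coords, value):
    """Return (index, grid value) of the entry in coords closest to value."""
    coords = np.asarray(coords, dtype=float)
    idx = int(np.abs(coords - value).argmin())
    return idx, float(coords[idx])

# Made-up coordinate arrays on a 0.25 degree grid (assumption, for illustration only)
lon_array = np.arange(0.0, 360.0, 0.25)     # 1440 points
lat_array = np.arange(90.0, -90.25, -0.25)  # 721 points, north to south

i, lon_used = nearest_index(lon_array, 10.1)
j, lat_used = nearest_index(lat_array, 29.9)
print(i, lon_used)
print(j, lat_used)
```

If the file stores longitudes as 0 to 360, a negative (western) longitude would need converting first, for example with value % 360.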
Assuming that the dimensions of u10 are ordered as {time, lat, lon}, you can read the data as:
u10_time = u10[:,j,i]
This gives you all (time-varying) u10 values for your requested location.
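Putting the steps together, here is a rough sketch of the lookup the question describes: given a longitude and latitude, return each variable's values in time order. The names extract_point and FakeVar are hypothetical; FakeVar only stands in for a netCDF4 variable (something with a .dimensions tuple that can be sliced) so the example runs without the data file. Rather than assuming the {time, lat, lon} order, the index is built from each variable's dimension names.

```python
import numpy as np

def extract_point(variables, lon, lat,
                  names=("u10", "v10", "swh", "mwd", "mwp")):
    """Return {name: values over time} at the grid point nearest (lon, lat).

    variables: mapping like Dataset.variables; each entry must have a
    .dimensions tuple and support NumPy-style indexing.
    """
    i = int(np.abs(np.asarray(variables["longitude"][:]) - lon).argmin())
    j = int(np.abs(np.asarray(variables["latitude"][:]) - lat).argmin())
    pick = {"longitude": i, "latitude": j}

    result = {"time": variables["time"][:]}
    for name in names:
        var = variables[name]
        # Fixed index for longitude/latitude, full slice for any other dimension (time)
        index = tuple(pick.get(dim, slice(None)) for dim in var.dimensions)
        result[name] = var[index]
    return result


class FakeVar(object):
    """Hypothetical stand-in for a netCDF4 variable, for this demo only."""
    def __init__(self, data, dimensions):
        self.data = np.asarray(data)
        self.dimensions = dimensions

    def __getitem__(self, key):
        return self.data[key]


# Tiny made-up dataset: 3 time steps, 2 latitudes, 4 longitudes
dims = ("time", "latitude", "longitude")
demo = {
    "longitude": FakeVar([0.0, 10.0, 20.0, 30.0], ("longitude",)),
    "latitude": FakeVar([40.0, 30.0], ("latitude",)),
    "time": FakeVar([0, 6, 12], ("time",)),
}
for k, name in enumerate(("u10", "v10", "swh", "mwd", "mwp")):
    demo[name] = FakeVar(np.arange(24).reshape(3, 2, 4) + 100 * k, dims)

series = extract_point(demo, lon=10, lat=30)
print(series["time"], series["u10"])
```

With the file from the question, the call would presumably be extract_point(jan.variables, 10, 30). If the time variable carries a units attribute, netCDF4.num2date(time[:], time.units) can turn the raw time values into dates before you write the rows to your database.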