I have been tasked with extracting the data from the data files of an old piece of software, CIMplicity HMI Plant Edition version 6.0. It's SCADA software from 2002. I have a copy of the data files directory, which contains a lot of *.DAT and *.IDX files. I am required to extract this data to CSV or to an SQL database. Some of the DAT files are just plain text, but others have a binary-like format and, when opened in PSPad, show up in HEX view mode.
What tools can I use to reliably read and extract the data from these files?
TIA.
UPDATE: I've added a directory listing of the directory with the data files:
Directory of C:\tmp\xxxxxxII\data
04/30/2013 01:53 PM <DIR> .
04/30/2013 01:53 PM <DIR> ..
09/02/2008 10:46 AM 17,260 1220323606.clz
09/02/2008 10:46 AM 60,490 1220323607.clz
09/10/2008 06:36 PM 288,554 1220323608.clz
09/02/2008 10:46 AM 66,977 1220323609.clz
09/10/2008 06:37 PM 23,900 1220323610.clz
09/10/2008 06:37 PM 19,162 1220323611.clz
09/10/2008 06:48 PM 37,596 1220323612.clz
09/10/2008 06:49 PM 27,882 1220323613.clz
09/10/2008 06:49 PM 47,850 1220323614.clz
09/10/2008 06:50 PM 47,816 1220323615.clz
09/10/2008 06:52 PM 3,427,511 1220323616.clz
09/02/2008 10:46 AM 31,169 1220323617.clz
09/10/2008 06:53 PM 30,353 1220323618.clz
09/02/2008 10:46 AM 122,159 1220323619.clz
09/02/2008 10:50 AM 3,539,414 1220323828.clz
09/10/2008 06:02 PM 208 action.dat
09/10/2008 06:02 PM 3,072 action.idx
02/19/2002 11:58 PM 5,636 alarm_class.dat
02/19/2002 11:58 PM 3,072 alarm_class.idx
09/23/2008 04:26 PM 137,128 alarm_def.dat
09/23/2008 04:26 PM 49,152 alarm_def.idx
02/19/2002 11:58 PM 2,929 alarm_field.dat
02/19/2002 11:58 PM 4,096 alarm_field.idx
02/19/2002 11:58 PM 0 alarm_intproc.dat
02/19/2002 11:58 PM 4,096 alarm_intproc.idx
02/19/2002 11:58 PM 135 alarm_mgr.dat
02/19/2002 11:58 PM 3,072 alarm_mgr.idx
09/23/2008 04:26 PM 69,531 alarm_routing.dat
09/23/2008 04:26 PM 387,072 alarm_routing.idx
02/19/2002 11:58 PM 912 alarm_type.dat
02/19/2002 11:58 PM 3,072 alarm_type.idx
02/19/2002 11:58 PM 0 alm_setup.dat
02/19/2002 11:58 PM 4,096 alm_setup.idx
02/19/2002 11:58 PM 0 alm_setup_cl.dat
02/19/2002 11:58 PM 3,072 alm_setup_cl.idx
02/19/2002 11:58 PM 0 alm_setup_fr.dat
02/19/2002 11:58 PM 3,072 alm_setup_fr.idx
02/19/2002 11:58 PM 0 alm_user.dat
02/19/2002 11:58 PM 3,072 alm_user.idx
02/19/2002 11:58 PM 0 alrm_blk_alarm.dat
02/19/2002 11:58 PM 4,096 alrm_blk_alarm.idx
02/19/2002 11:58 PM 0 alrm_blk_group.dat
02/19/2002 11:58 PM 3,072 alrm_blk_group.idx
02/11/1998 04:05 PM 602 amlp.cfg
09/10/2008 06:53 PM 2,415 class.dat
09/10/2008 06:53 PM 3,072 class.idx
02/11/1998 04:06 PM 5 client.cfg
09/10/2008 02:14 PM 393 comm_exe.dat
09/10/2008 02:14 PM 4,096 comm_exe.idx
09/23/2008 03:40 PM 9,893 datalog.dat
09/23/2008 03:40 PM 5,120 datalog.idx
02/19/2002 11:58 PM 1,272 data_field.dat
02/19/2002 11:58 PM 3,072 data_field.idx
09/04/2008 03:10 PM 1,218 dbms_def.dat
09/04/2008 03:10 PM 3,072 dbms_def.idx
09/16/2008 10:45 AM 37,820 derived_point.dat
09/16/2008 10:45 AM 16,384 derived_point.idx
09/10/2008 02:14 PM 256 devcom_proc.dat
09/10/2008 02:14 PM 4,096 devcom_proc.idx
09/10/2008 02:16 PM 1,305 device.dat
09/10/2008 02:16 PM 5,120 device.idx
09/23/2008 04:26 PM 2,243,024 device_point.dat
09/23/2008 04:26 PM 1,745,920 device_point.idx
09/23/2008 04:04 PM 6 dyn_cfg.cfg
02/19/2002 11:58 PM 0 em_alu.dat
02/19/2002 11:58 PM 3,072 em_alu.idx
02/19/2002 11:58 PM 0 es_eu_conv.dat
02/19/2002 11:58 PM 3,072 es_eu_conv.idx
02/19/2002 11:58 PM 0 es_point_info.dat
02/19/2002 11:58 PM 4,096 es_point_info.idx
09/23/2008 04:26 PM 719,712 eu_conv.dat
09/23/2008 04:26 PM 78,848 eu_conv.idx
09/10/2008 06:02 PM 166 event.dat
09/10/2008 06:02 PM 3,072 event.idx
09/10/2008 06:03 PM 121 event_action.dat
09/10/2008 06:03 PM 3,072 event_action.idx
04/30/2013 01:53 PM 0 f.txt
02/19/2002 09:49 PM 199,302 field_def.dat
02/19/2002 09:49 PM 87,040 field_def.idx
09/10/2008 02:15 PM 1,608 fr.dat
09/10/2008 02:15 PM 5,120 fr.idx
07/15/2010 03:41 PM 262 gef_cfg.ini
09/23/2008 03:39 PM 6,435 glb_parms.dat
09/23/2008 03:39 PM 6,144 glb_parms.idx
12/15/1999 11:16 AM 899 ie_deflds.cfg
11/14/2001 11:06 AM 1,101 ie_formats.cfg
02/19/2002 09:49 PM 7,548 keyconst.dat
02/19/2002 09:49 PM 18,432 keyconst.idx
02/19/2002 09:49 PM 16,984 key_field.dat
02/19/2002 09:49 PM 13,312 key_field.idx
02/19/2002 09:49 PM 9,546 lenconst.dat
02/19/2002 09:49 PM 17,408 lenconst.idx
09/10/2008 02:14 PM 990 logproc.dat
09/10/2008 02:14 PM 3,072 logproc.idx
09/23/2008 03:54 PM 47,952 log_attrib.dat
09/23/2008 03:54 PM 77,824 log_attrib.idx
09/23/2008 03:40 PM 1,848 log_event.dat
09/23/2008 03:40 PM 4,096 log_event.idx
08/05/1998 09:04 AM 1,671 log_names.cfg
09/10/2008 02:14 PM 121 master.mcp
07/18/2008 06:32 PM 32 master_mcp.app
09/10/2008 02:14 PM 29 master_mcp.dc
07/18/2008 06:32 PM 52 master_mcp.rp
09/28/2001 02:22 PM 17,449 master_opc_0.ini
02/19/2002 11:58 PM 11,312 meas_assoc.dat
02/19/2002 11:58 PM 8,192 meas_assoc.idx
02/19/2002 11:58 PM 276 meas_system.dat
02/19/2002 11:58 PM 3,072 meas_system.idx
02/19/2002 11:58 PM 1,096 meas_unit.dat
02/19/2002 11:58 PM 3,072 meas_unit.idx
09/10/2008 02:14 PM 365 model.dat
09/10/2008 02:14 PM 4,096 model.idx
07/18/2008 06:32 PM 86 node.dat
07/18/2008 06:32 PM 3,072 node.idx
09/10/2008 02:14 PM 2,167 node_logproc.dat
09/10/2008 02:14 PM 5,120 node_logproc.idx
09/23/2008 04:26 PM 32,890 object.dat
09/23/2008 04:26 PM 28,672 object.idx
09/23/2008 04:26 PM 310,464 object_attrib.dat
09/23/2008 04:26 PM 293,888 object_attrib.idx
09/23/2008 04:26 PM 22,080 object_routing.dat
09/23/2008 04:26 PM 30,720 object_routing.idx
09/10/2008 02:14 PM 715 physproc.dat
09/10/2008 02:14 PM 5,120 physproc.idx
04/26/2010 12:27 PM 2,527,608 point.dat
04/26/2010 12:27 PM 637,952 point.idx
02/19/2002 11:58 PM 95 point_alstr.dat
02/19/2002 11:58 PM 3,072 point_alstr.idx
02/19/2002 11:58 PM 0 point_disp.dat
02/19/2002 11:58 PM 3,072 point_disp.idx
02/19/2002 11:58 PM 194 point_enum.dat
02/19/2002 11:58 PM 3,072 point_enum.idx
02/19/2002 11:58 PM 216 point_enum_fld.dat
02/19/2002 11:58 PM 4,096 point_enum_fld.idx
02/19/2002 11:58 PM 609 point_type.dat
02/19/2002 11:58 PM 3,072 point_type.idx
09/10/2008 02:14 PM 129 port_comm.dat
09/10/2008 02:14 PM 4,096 port_comm.idx
09/10/2008 02:14 PM 42 port_list.dat
09/10/2008 02:14 PM 4,096 port_list.idx
09/10/2008 02:14 PM 294 port_type.dat
09/10/2008 02:14 PM 3,072 port_type.idx
02/11/1998 04:05 PM 5 projects.cfg
09/10/2008 02:14 PM 123 protocol.dat
09/10/2008 02:14 PM 3,072 protocol.idx
02/19/2002 11:58 PM 37 ptmgmt.dat
02/19/2002 11:58 PM 3,072 ptmgmt.idx
04/27/2010 03:25 PM 19,343 ptx_points.cfg
02/19/2002 11:58 PM 0 pt_uf_assoc.dat
02/19/2002 11:58 PM 4,096 pt_uf_assoc.idx
02/19/2002 11:58 PM 93 pt_uf_sets.dat
02/19/2002 11:58 PM 3,072 pt_uf_sets.idx
02/19/2002 11:58 PM 1,568 pt_user_fields.dat
02/19/2002 11:58 PM 4,096 pt_user_fields.idx
02/19/2002 09:49 PM 9,856 record_def.dat
02/19/2002 09:49 PM 13,312 record_def.idx
02/19/2002 11:58 PM 0 redund_addrs.dat
02/19/2002 11:58 PM 4,096 redund_addrs.idx
02/19/2002 11:58 PM 111 role.dat
02/19/2002 11:58 PM 3,072 role.idx
02/19/2002 11:58 PM 159 role_subsys.dat
02/19/2002 11:58 PM 4,096 role_subsys.idx
09/10/2008 02:14 PM 1,067 service.dat
09/10/2008 02:14 PM 5,120 service.idx
02/19/2002 11:58 PM 181 service_use.dat
02/19/2002 11:58 PM 3,072 service_use.idx
09/23/2008 03:57 PM 711 TCPIP0.ini
02/19/2002 11:58 PM 0 trend_pt.dat
02/19/2002 11:58 PM 3,072 trend_pt.idx
09/02/2008 09:29 AM 434 user.dat
09/02/2008 09:29 AM 4,096 user.idx
09/10/2008 02:15 PM 784 user_fr.dat
09/10/2008 02:15 PM 6,144 user_fr.idx
07/15/2010 03:40 PM 15 warmdata.sav
174 File(s) 17,990,662 bytes
UPDATE: Screenshot of point.dat attached:
My condolences: you have been given an almost impossible task.
Ok, I don't know much about CIMPLICITY HMI Plant Edition, save what I can see from the outside. But maybe I can give you some helpful pointers, since I am also developing industrial HMI and Engineering software. (WinCC & PCS7)
First off, the format is, with a certainty of 99%, a proprietary format. The only real help you can expect to get is from GE. The question is, will they give it to you and at what cost. (Since you are trying to extract data, you are probably migrating away; then why should they help.)
So some reverse engineering is probably in order. You want the software that created the project. That is the engineering package and not the HMI run-time software. (They may be one and the same piece of software.) Your company should still have licenses for that from the original project.
From there you either create a new project and see what happens, and/or load the existing project, play around, and see what changes in the data. It is important that you understand what data is in the system and where you can find it.
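The "change something and see what changes in the data" step can be made less painful with a small byte-level comparison. This is only a minimal sketch: the function name `changed_ranges` and the file names in the usage comment are made up for illustration, and it assumes you keep a copy of a data file from before and after a single, known edit in the engineering software.

```python
def changed_ranges(before: bytes, after: bytes):
    """Return half-open (start, end) byte ranges where the two buffers differ."""
    ranges = []
    start = None
    common = min(len(before), len(after))
    for i in range(common):
        if before[i] != after[i]:
            if start is None:
                start = i          # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(before) != len(after):
        # One file is longer: report the extra tail as a changed range too.
        ranges.append((common, max(len(before), len(after))))
    return ranges


# Hypothetical usage: snapshot point.dat, rename one point, snapshot again.
#   from pathlib import Path
#   old = Path("point_before.dat").read_bytes()
#   new = Path("point_after.dat").read_bytes()
#   for start, end in changed_ranges(old, new):
#       print(f"{start:#010x}-{end:#010x}", new[start:end])
```

Changing exactly one thing per snapshot keeps the list of ranges short enough to map each range to a field.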
It is important to note that the data comes in two forms, project configuration and run-time data.
The project configuration is not really data; it is "just" configuration that tells the run-time system what to display and how to react. It covers things like users, views/screens, PLC variable bindings, and control scripts. I honestly don't know how you would put most of this efficiently into CSV or a database. But it is possible: WinCC, for example, uses MSSQL (with a default password, thank you Stuxnet).
The run-time data is simpler. These are either event logs, which record things like operator commands, alarms, and warnings, or recorded values. This data should be easy to extract, since it is strictly formatted. To find out what was recorded and where, open the project in the engineering software; there you should find clues about logging and trends.
Nevertheless you want to get yourself a good hex editor and LOTS of time.
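If you don't have a hex editor at hand, a few lines of Python give you a basic offset / hex / ASCII view, which is enough to spot repeating structures. This is a generic sketch, not anything specific to CIMPLICITY; the function name `hexdump` and the file name in the usage comment are just examples.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as lines of: offset, hex bytes, printable ASCII."""
    pad = width * 3 - 1  # width of a full hex column, e.g. "41 42 ... 00"
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is, everything else as a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(pad)}  {text}")
    return "\n".join(lines)


# Hypothetical usage: look at the first 256 bytes of a data file.
#   from pathlib import Path
#   print(hexdump(Path("point.dat").read_bytes()[:256]))
```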
Addendum:
After you added the listing and screenshot, the files make 3/4 sense to me.
For example, alarm_class.* contains the classes of alarms that the run-time may raise; alarm_type holds the data types of alarms; alarm_field contains the configured field alarms, i.e. those coming from the PLC; alarm_routing covers routing or network errors; alm_user are probably user alarms, i.e. from scripts in the HMI run-time.
Everything with point in the name is probably a "measurement point", that is a field device; either a sensor or a feedback from an actuator.
Everything with user in the name is probably the configured users and their permissions.
redund_addrs is a map of PLC or device addresses that are redundant siblings of primary values.
Everything with port is probably about the "ports" on the SCADA server or PLC, for example FF, PROFIBUS or PROFINET.
Everything with object is probably structures. That is when single variables (aka tags) are bound together to form hierarchical values. For example, the value of a sensor and all values about its status are bound into one structure, which may then be treated as one measurement point.
Also, looking at the hex dump you provided, it seems to be a structured, constant-length format.
In this case:
struct Point
{
    char name[16];     /* e.g. "$ALARM.ACKED" */
    char type[32];     /* e.g. "UDINT" */
    char comment[128]; /* e.g. "..." */
};
The values are then padded with spaces (0x20).
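If the records really are fixed-length and space-padded like that, dumping them to CSV is straightforward. The sketch below takes the field widths (16, 32, 128) straight from the guessed struct above, so treat them as assumptions to be corrected against the real hex dump. The names `RECORD`, `FIELDS`, `parse_records` and `to_csv` are made up for this example, and it assumes there is no file header before the first record.

```python
import csv
import io
import struct

# ASSUMED layout from the guessed struct: 16 + 32 + 128 bytes, no header.
RECORD = struct.Struct("<16s32s128s")
FIELDS = ("name", "type", "comment")


def parse_records(data: bytes):
    """Yield one dict per complete fixed-length record found in data."""
    usable = len(data) - len(data) % RECORD.size  # ignore a trailing partial record
    for raw in RECORD.iter_unpack(data[:usable]):
        # latin-1 never fails to decode; strip space (0x20) and NUL padding.
        yield {k: v.decode("latin-1").rstrip(" \x00") for k, v in zip(FIELDS, raw)}


def to_csv(data: bytes) -> str:
    """Render all records as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(parse_records(data))
    return out.getvalue()


# Hypothetical usage:
#   from pathlib import Path
#   Path("point.csv").write_text(to_csv(Path("point.dat").read_bytes()))
```

If the output looks shifted from one row to the next, the record length or a header offset is wrong; adjust the format string and try again.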
Addendum 2:
You may actually be looking at a flat-file database, such as dBase, where each *.dat is the data and each *.idx the table index. Just an idea. It may pay to invest some time in the DB technology available circa 2000. Then maybe you can just "dump" the data.