Basics of Lab Data Files

The lab data generated by ALGL is delivered to clients in a variety of formats for use in software. One key distinction about these data files is that they contain lab data only. There are data file formats that are dedicated to the spatial representation of lab data, but we at the laboratory do not generate these files. In most cases the GPS coordinates of the sample location are not shared with the lab.

The most common data files are CSV files. They are identified by their file extension, .csv, and are commonly called “comma-delimited” files. These are basic text files in which the data for a given sample is contained in a single line of text, and each piece of data is separated by a comma. These files can be opened, viewed, and modified in spreadsheet programs like Excel and text applications like Notepad, but they must be edited with care (for example, without adding or removing commas or line breaks) to maintain the integrity of the file structure. Often critical metadata, which is information that provides the context the end user needs to understand the data, is not contained in the data file. This metadata includes such information as the type of data being presented (analyte), units, and extraction method. It is often left out because not all software packages require this information to be stated explicitly within a data file.
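The structure described above can be illustrated with a short Python sketch. It is only an illustration: the column names, sample values, units, and extraction methods are invented for the example and are not taken from an actual ALGL file. It reads each line as one sample, keeps every value as text so that nothing is silently reformatted, and supplies the missing metadata from a separate lookup.

```python
import csv
import io

# Hypothetical CSV content: one header line, then one line per sample.
# Column names and values are invented for illustration.
CSV_TEXT = (
    "SampleID,pH,P,K\n"
    "00123,6.4,35,142\n"
    "00124,7.1,18,98\n"
)

# Metadata that the file itself does not carry (assumed values).
COLUMN_METADATA = {
    "pH": {"units": None, "method": "1:1 water"},
    "P": {"units": "ppm", "method": "Mehlich 3"},
    "K": {"units": "ppm", "method": "Mehlich 3"},
}


def read_samples(text):
    """Return one dict per sample line, keeping every value as a string."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)


samples = read_samples(CSV_TEXT)
for row in samples:
    # The units come from the separate lookup, not from the CSV file.
    print(row["SampleID"], row["P"], COLUMN_METADATA["P"]["units"])
```

Keeping values as text preserves details such as the leading zeros in a sample ID, which a spreadsheet program may drop when it interprets the value as a number.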

A data file format that is growing in use is the Modus XML file. Modus is a standardized system of defined terminology, metadata, and file structures that has grown from a need to manage and exchange agricultural testing data. The file format follows an XML structure. XML is a text-based markup language in which each piece of data is enclosed in descriptive tags. The Modus files have a standardized data structure and use a preset list of codes to identify all parameters of the sample, such as lab method and units.
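To show how a tagged file can carry its own metadata, here is a minimal Python sketch. The element names, attribute names, and codes below are invented for the example and are NOT the real Modus schema; the published Modus documentation should be consulted for the actual structure.

```python
import xml.etree.ElementTree as ET

# NOT the real Modus schema: every element name, attribute name, and code
# here is a made-up placeholder used only to show the general idea.
XML_TEXT = """
<Results>
  <Sample id="00123">
    <Result analyte="P" code="HYPOTHETICAL.CODE.1" units="ppm">35</Result>
    <Result analyte="K" code="HYPOTHETICAL.CODE.2" units="ppm">142</Result>
  </Sample>
</Results>
"""


def parse_results(xml_text):
    """Flatten the tagged structure into one dict per reported value."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for sample in root.iter("Sample"):
        for result in sample.iter("Result"):
            rows.append({
                "sample_id": sample.get("id"),
                "analyte": result.get("analyte"),
                "code": result.get("code"),
                "units": result.get("units"),
                "value": result.text,
            })
    return rows


rows = parse_results(XML_TEXT)
for row in rows:
    print(row)
```

Unlike the CSV case, the units and the method code travel with each value, so the receiving software does not need to be told them separately.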

We often get requests for shapefiles. A shapefile is not a single file, but rather a set of files that are often grouped in a ZIP file. Each data set is contained in three identically named files with the file extensions .shp, .dbf, and .shx. Each file contains an aspect of the complete data set. These files are specific to GPS/GIS mapped data. The GPS coordinates of the sample location are required to generate these maps.
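Because a data set is only complete when all three files are present, it can be useful to check a ZIP file before importing it. The following Python sketch does that using only the standard library. The file names are invented, and the demo archive holds empty placeholder files rather than real map data.

```python
import io
import zipfile
from pathlib import PurePosixPath

# The three file extensions that together make up one data set.
REQUIRED = {".shp", ".shx", ".dbf"}


def missing_parts(zip_bytes):
    """Map each data set name to the required extensions it lacks."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            path = PurePosixPath(name)
            # Group by file name only; folders inside the ZIP are ignored.
            found.setdefault(path.stem, set()).add(path.suffix.lower())
    return {
        base: REQUIRED - exts
        for base, exts in found.items()
        if exts & REQUIRED and REQUIRED - exts
    }


# Build a small demo ZIP in memory (hypothetical names, empty files).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in ["field_a.shp", "field_a.shx", "field_a.dbf",
                 "field_b.shp", "field_b.dbf"]:
        zf.writestr(name, b"")
demo_zip = buffer.getvalue()

print(missing_parts(demo_zip))
```

In this demo, field_a is complete and field_b is reported as missing its .shx file.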

Software companies often share data electronically about a soil sample or set of soil samples before the samples arrive at the lab. This data can be as simple as a grower, farm, and field name, or very detailed, and is often determined by the software’s data flow. In most cases a unique identifier (serial number) is assigned to the sample or set of samples. That unique identifier accompanies the samples to the lab and is used to link the electronic data that was sent by the software ahead of time to the physical sample. This also makes data entry more efficient for both the customer and the lab.
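The linking step can be sketched in a few lines of Python. This is an illustration only, not a description of the lab's actual system: the serial numbers, field names, and values are all invented for the example.

```python
# Data sent ahead of time by the software, keyed by the unique identifier.
# All identifiers and values are hypothetical.
pre_registered = {
    "SN-1001": {"grower": "Grower A", "farm": "Farm 1", "field": "North"},
    "SN-1002": {"grower": "Grower A", "farm": "Farm 1", "field": "South"},
}

# Results for the physical samples, each carrying the same identifier.
lab_results = [
    {"serial": "SN-1001", "pH": "6.4"},
    {"serial": "SN-9999", "pH": "7.0"},
]


def link_results(pre_registered, lab_results):
    """Attach the pre-sent details to each result with a known serial."""
    linked, unmatched = [], []
    for result in lab_results:
        info = pre_registered.get(result["serial"])
        if info is None:
            # No electronic record was received for this serial number.
            unmatched.append(result["serial"])
        else:
            linked.append({**info, **result})
    return linked, unmatched


linked, unmatched = link_results(pre_registered, lab_results)
print(linked)
print(unmatched)
```

Samples whose identifier has no matching electronic record are collected separately so that they can be followed up rather than silently dropped.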

When the lab results are released, they are sent to the customer in one of several ways. Some data is simply emailed to the customer for manual import. Some data is emailed to a server, and the server automates the uploading of the data to the software. Data can also be sent directly from our server through an automatic interface that processes the data and imports it into the software platform.

