resistics_readers.multifile module

This module has utility functions to help when single recordings are split into multiple data files.

In this scenario, it is possible for each data file to have specfic parameters such as differing levels or scalings required to conver to field units.

Further, it is useful to check that the first and last times of each individual file are continuous and without gaps.

pydantic model resistics_readers.multifile.TimeMetadataSingle[source]

Bases: resistics.time.TimeMetadata

TimeMetadata class for a single file in a multi data file recording

In most cases, individual data formats may inherit from this as there could be other parameters that are useful to save per data file.

Show JSON schema
{
   "title": "TimeMetadataSingle",
   "description": "TimeMetadata class for a single file in a multi data file recording\n\nIn most cases, individual data formats may inherit from this as there could\nbe other parameters that are useful to save per data file.",
   "type": "object",
   "properties": {
      "file_info": {
         "$ref": "#/definitions/ResisticsFile"
      },
      "fs": {
         "title": "Fs",
         "type": "number"
      },
      "chans": {
         "title": "Chans",
         "type": "array",
         "items": {
            "type": "string"
         }
      },
      "n_chans": {
         "title": "N Chans",
         "type": "integer"
      },
      "n_samples": {
         "title": "N Samples",
         "type": "integer"
      },
      "first_time": {
         "title": "First Time",
         "pattern": "%Y-%m-%d %H:%M:%S.%f_%o_%q_%v",
         "examples": [
            "2021-01-01 00:00:00.000061_035156_250000_000000"
         ]
      },
      "last_time": {
         "title": "Last Time",
         "pattern": "%Y-%m-%d %H:%M:%S.%f_%o_%q_%v",
         "examples": [
            "2021-01-01 00:00:00.000061_035156_250000_000000"
         ]
      },
      "system": {
         "title": "System",
         "default": "",
         "type": "string"
      },
      "serial": {
         "title": "Serial",
         "default": "",
         "type": "string"
      },
      "wgs84_latitude": {
         "title": "Wgs84 Latitude",
         "default": -999.0,
         "type": "number"
      },
      "wgs84_longitude": {
         "title": "Wgs84 Longitude",
         "default": -999.0,
         "type": "number"
      },
      "easting": {
         "title": "Easting",
         "default": -999.0,
         "type": "number"
      },
      "northing": {
         "title": "Northing",
         "default": -999.0,
         "type": "number"
      },
      "elevation": {
         "title": "Elevation",
         "default": -999.0,
         "type": "number"
      },
      "chans_metadata": {
         "title": "Chans Metadata",
         "type": "object",
         "additionalProperties": {
            "$ref": "#/definitions/ChanMetadata"
         }
      },
      "history": {
         "title": "History",
         "default": {
            "records": []
         },
         "allOf": [
            {
               "$ref": "#/definitions/History"
            }
         ]
      },
      "data_file": {
         "title": "Data File",
         "type": "string"
      }
   },
   "required": [
      "fs",
      "chans",
      "n_samples",
      "first_time",
      "last_time",
      "chans_metadata",
      "data_file"
   ],
   "definitions": {
      "ResisticsFile": {
         "title": "ResisticsFile",
         "description": "Required information for writing out a resistics file",
         "type": "object",
         "properties": {
            "created_on_local": {
               "title": "Created On Local",
               "type": "string",
               "format": "date-time"
            },
            "created_on_utc": {
               "title": "Created On Utc",
               "type": "string",
               "format": "date-time"
            },
            "version": {
               "title": "Version",
               "type": "string"
            }
         }
      },
      "ChanMetadata": {
         "title": "ChanMetadata",
         "description": "Channel metadata",
         "type": "object",
         "properties": {
            "name": {
               "title": "Name",
               "type": "string"
            },
            "data_files": {
               "title": "Data Files",
               "type": "array",
               "items": {
                  "type": "string"
               }
            },
            "chan_type": {
               "title": "Chan Type",
               "type": "string"
            },
            "chan_source": {
               "title": "Chan Source",
               "type": "string"
            },
            "sensor": {
               "title": "Sensor",
               "default": "",
               "type": "string"
            },
            "serial": {
               "title": "Serial",
               "default": "",
               "type": "string"
            },
            "gain1": {
               "title": "Gain1",
               "default": 1,
               "type": "number"
            },
            "gain2": {
               "title": "Gain2",
               "default": 1,
               "type": "number"
            },
            "scaling": {
               "title": "Scaling",
               "default": 1,
               "type": "number"
            },
            "chopper": {
               "title": "Chopper",
               "default": false,
               "type": "boolean"
            },
            "dipole_dist": {
               "title": "Dipole Dist",
               "default": 1,
               "type": "number"
            },
            "sensor_calibration_file": {
               "title": "Sensor Calibration File",
               "default": "",
               "type": "string"
            },
            "instrument_calibration_file": {
               "title": "Instrument Calibration File",
               "default": "",
               "type": "string"
            }
         },
         "required": [
            "name"
         ]
      },
      "Record": {
         "title": "Record",
         "description": "Class to hold a record\n\nA record holds information about a process that was run. It is intended to\ntrack processes applied to data, allowing a process history to be saved\nalong with any datasets.\n\nExamples\n--------\nA simple example of creating a process record\n\n>>> from resistics.common import Record\n>>> messages = [\"message 1\", \"message 2\"]\n>>> record = Record(\n...     creator={\"name\": \"example\", \"parameter1\": 15},\n...     messages=messages,\n...     record_type=\"example\"\n... )\n>>> record.summary()\n{\n    'time_local': '...',\n    'time_utc': '...',\n    'creator': {'name': 'example', 'parameter1': 15},\n    'messages': ['message 1', 'message 2'],\n    'record_type': 'example'\n}",
         "type": "object",
         "properties": {
            "time_local": {
               "title": "Time Local",
               "type": "string",
               "format": "date-time"
            },
            "time_utc": {
               "title": "Time Utc",
               "type": "string",
               "format": "date-time"
            },
            "creator": {
               "title": "Creator",
               "type": "object"
            },
            "messages": {
               "title": "Messages",
               "type": "array",
               "items": {
                  "type": "string"
               }
            },
            "record_type": {
               "title": "Record Type",
               "type": "string"
            }
         },
         "required": [
            "creator",
            "messages",
            "record_type"
         ]
      },
      "History": {
         "title": "History",
         "description": "Class for storing processing history\n\nParameters\n----------\nrecords : List[Record], optional\n    List of records, by default []\n\nExamples\n--------\n>>> from resistics.testing import record_example1, record_example2\n>>> from resistics.common import History\n>>> record1 = record_example1()\n>>> record2 = record_example2()\n>>> history = History(records=[record1, record2])\n>>> history.summary()\n{\n    'records': [\n        {\n            'time_local': '...',\n            'time_utc': '...',\n            'creator': {\n                'name': 'example1',\n                'a': 5,\n                'b': -7.0\n            },\n            'messages': ['Message 1', 'Message 2'],\n            'record_type': 'process'\n        },\n        {\n            'time_local': '...',\n            'time_utc': '...',\n            'creator': {\n                'name': 'example2',\n                'a': 'parzen',\n                'b': -21\n            },\n            'messages': ['Message 5', 'Message 6'],\n            'record_type': 'process'\n        }\n    ]\n}",
         "type": "object",
         "properties": {
            "records": {
               "title": "Records",
               "default": [],
               "type": "array",
               "items": {
                  "$ref": "#/definitions/Record"
               }
            }
         }
      }
   }
}

field data_file: str = PydanticUndefined

The name of a single data file in a multi file recording

pydantic model resistics_readers.multifile.TimeMetadataMerge[source]

Bases: resistics.time.TimeMetadata

This is an extension of TimeMetadata for situations where a single continuous recording has been split into multiple data files.

This is applicable to SPAM data as well as Lemi B423 for example.

To keep track of the metadata about all the contributing files, this extension to TimeMetadata adds a dictionary storing file specific details such as first and last time, specific scalings or reading parameters etc.

Show JSON schema
{
   "title": "TimeMetadataMerge",
   "description": "This is an extension of TimeMetadata for situations where a single\ncontinuous recording has been split into multiple data files.\n\nThis is applicable to SPAM data as well as Lemi B423 for example.\n\nTo keep track of the metadata about all the contributing files, this\nextension to TimeMetadata adds a dictionary storing file specific details\nsuch as first and last time, specific scalings or reading parameters etc.",
   "type": "object",
   "properties": {
      "file_info": {
         "$ref": "#/definitions/ResisticsFile"
      },
      "fs": {
         "title": "Fs",
         "type": "number"
      },
      "chans": {
         "title": "Chans",
         "type": "array",
         "items": {
            "type": "string"
         }
      },
      "n_chans": {
         "title": "N Chans",
         "type": "integer"
      },
      "n_samples": {
         "title": "N Samples",
         "type": "integer"
      },
      "first_time": {
         "title": "First Time",
         "pattern": "%Y-%m-%d %H:%M:%S.%f_%o_%q_%v",
         "examples": [
            "2021-01-01 00:00:00.000061_035156_250000_000000"
         ]
      },
      "last_time": {
         "title": "Last Time",
         "pattern": "%Y-%m-%d %H:%M:%S.%f_%o_%q_%v",
         "examples": [
            "2021-01-01 00:00:00.000061_035156_250000_000000"
         ]
      },
      "system": {
         "title": "System",
         "default": "",
         "type": "string"
      },
      "serial": {
         "title": "Serial",
         "default": "",
         "type": "string"
      },
      "wgs84_latitude": {
         "title": "Wgs84 Latitude",
         "default": -999.0,
         "type": "number"
      },
      "wgs84_longitude": {
         "title": "Wgs84 Longitude",
         "default": -999.0,
         "type": "number"
      },
      "easting": {
         "title": "Easting",
         "default": -999.0,
         "type": "number"
      },
      "northing": {
         "title": "Northing",
         "default": -999.0,
         "type": "number"
      },
      "elevation": {
         "title": "Elevation",
         "default": -999.0,
         "type": "number"
      },
      "chans_metadata": {
         "title": "Chans Metadata",
         "type": "object",
         "additionalProperties": {
            "$ref": "#/definitions/ChanMetadata"
         }
      },
      "history": {
         "title": "History",
         "default": {
            "records": []
         },
         "allOf": [
            {
               "$ref": "#/definitions/History"
            }
         ]
      },
      "data_table": {
         "title": "Data Table",
         "type": "object"
      }
   },
   "required": [
      "fs",
      "chans",
      "n_samples",
      "first_time",
      "last_time",
      "chans_metadata",
      "data_table"
   ],
   "definitions": {
      "ResisticsFile": {
         "title": "ResisticsFile",
         "description": "Required information for writing out a resistics file",
         "type": "object",
         "properties": {
            "created_on_local": {
               "title": "Created On Local",
               "type": "string",
               "format": "date-time"
            },
            "created_on_utc": {
               "title": "Created On Utc",
               "type": "string",
               "format": "date-time"
            },
            "version": {
               "title": "Version",
               "type": "string"
            }
         }
      },
      "ChanMetadata": {
         "title": "ChanMetadata",
         "description": "Channel metadata",
         "type": "object",
         "properties": {
            "name": {
               "title": "Name",
               "type": "string"
            },
            "data_files": {
               "title": "Data Files",
               "type": "array",
               "items": {
                  "type": "string"
               }
            },
            "chan_type": {
               "title": "Chan Type",
               "type": "string"
            },
            "chan_source": {
               "title": "Chan Source",
               "type": "string"
            },
            "sensor": {
               "title": "Sensor",
               "default": "",
               "type": "string"
            },
            "serial": {
               "title": "Serial",
               "default": "",
               "type": "string"
            },
            "gain1": {
               "title": "Gain1",
               "default": 1,
               "type": "number"
            },
            "gain2": {
               "title": "Gain2",
               "default": 1,
               "type": "number"
            },
            "scaling": {
               "title": "Scaling",
               "default": 1,
               "type": "number"
            },
            "chopper": {
               "title": "Chopper",
               "default": false,
               "type": "boolean"
            },
            "dipole_dist": {
               "title": "Dipole Dist",
               "default": 1,
               "type": "number"
            },
            "sensor_calibration_file": {
               "title": "Sensor Calibration File",
               "default": "",
               "type": "string"
            },
            "instrument_calibration_file": {
               "title": "Instrument Calibration File",
               "default": "",
               "type": "string"
            }
         },
         "required": [
            "name"
         ]
      },
      "Record": {
         "title": "Record",
         "description": "Class to hold a record\n\nA record holds information about a process that was run. It is intended to\ntrack processes applied to data, allowing a process history to be saved\nalong with any datasets.\n\nExamples\n--------\nA simple example of creating a process record\n\n>>> from resistics.common import Record\n>>> messages = [\"message 1\", \"message 2\"]\n>>> record = Record(\n...     creator={\"name\": \"example\", \"parameter1\": 15},\n...     messages=messages,\n...     record_type=\"example\"\n... )\n>>> record.summary()\n{\n    'time_local': '...',\n    'time_utc': '...',\n    'creator': {'name': 'example', 'parameter1': 15},\n    'messages': ['message 1', 'message 2'],\n    'record_type': 'example'\n}",
         "type": "object",
         "properties": {
            "time_local": {
               "title": "Time Local",
               "type": "string",
               "format": "date-time"
            },
            "time_utc": {
               "title": "Time Utc",
               "type": "string",
               "format": "date-time"
            },
            "creator": {
               "title": "Creator",
               "type": "object"
            },
            "messages": {
               "title": "Messages",
               "type": "array",
               "items": {
                  "type": "string"
               }
            },
            "record_type": {
               "title": "Record Type",
               "type": "string"
            }
         },
         "required": [
            "creator",
            "messages",
            "record_type"
         ]
      },
      "History": {
         "title": "History",
         "description": "Class for storing processing history\n\nParameters\n----------\nrecords : List[Record], optional\n    List of records, by default []\n\nExamples\n--------\n>>> from resistics.testing import record_example1, record_example2\n>>> from resistics.common import History\n>>> record1 = record_example1()\n>>> record2 = record_example2()\n>>> history = History(records=[record1, record2])\n>>> history.summary()\n{\n    'records': [\n        {\n            'time_local': '...',\n            'time_utc': '...',\n            'creator': {\n                'name': 'example1',\n                'a': 5,\n                'b': -7.0\n            },\n            'messages': ['Message 1', 'Message 2'],\n            'record_type': 'process'\n        },\n        {\n            'time_local': '...',\n            'time_utc': '...',\n            'creator': {\n                'name': 'example2',\n                'a': 'parzen',\n                'b': -21\n            },\n            'messages': ['Message 5', 'Message 6'],\n            'record_type': 'process'\n        }\n    ]\n}",
         "type": "object",
         "properties": {
            "records": {
               "title": "Records",
               "default": [],
               "type": "array",
               "items": {
                  "$ref": "#/definitions/Record"
               }
            }
         }
      }
   }
}

field data_table: Dict[str, Any] = PydanticUndefined

The data table that will help with scaling and selecting data files

resistics_readers.multifile.validate_consistency(dir_path: pathlib.Path, metadata_list: List[resistics.time.TimeMetadata])bool[source]

Validate multi file metadata with each other to ensure they are consistent

This function checks:

  • Matching sampling frequency

  • Matching number of chans

  • Matching channels

Parameters
  • dir_path (Path) – The data path

  • metadata_list (List[TimeMetadata]) – List of TimeMetadata, one for each data file in a continuous recording

Returns

True if validation was successful, otherwise raises an Exception

Return type

bool

Raises
  • MetadataReadError – If multiple values are found for sampling frequency

  • MetadataReadError – If multiple values are found for number of chans

  • MetadataReadError – If different XTR files have different channels

resistics_readers.multifile.validate_continuous(dir_path: pathlib.Path, metadata_list: List[resistics_readers.multifile.TimeMetadataSingle])bool[source]

Validate that metadata is continuous

For data formats such as SPAM and Lemi B423 which separate a single continuous recording into multiple data files, it needs to be validated that there is no missing data.

This function validates that metadata from each individual data file does define a single continuous recording with no missing data.

Parameters
  • dir_path (Path) – The directory path

  • metadata_list (List[TimeMetadata]) – List of TimeMetadata with metadata from a set of data files that constitute a single continuous recording

Returns

True if recording is continuous

Return type

bool

Raises

MetadataReadError – If gaps were found

resistics_readers.multifile.add_cumulative_samples(df: pandas.core.frame.DataFrame)pandas.core.frame.DataFrame[source]

Add cumulative samples to data table

This is useful for multi file recordings and helping to decide which files to read to get a certain time range

Parameters

df (pd.DataFrame) – Data table with an n_samples column

Returns

Data table with a first_sample and last_sample column added

Return type

pd.DataFrame

resistics_readers.multifile.samples_to_sources(dir_path: pathlib.Path, df: pandas.core.frame.DataFrame, from_sample: int, to_sample: int)pandas.core.frame.DataFrame[source]

Find the data sources for a sample range

This can be used for a multi-file measurement or for a measurement that is split up into multiple records. It maps a sample range defined by from_sample and to_sample to the sources and returns a DataFrame providing information about the samples that need to be read from each source (file or record) to cover the range.

Parameters
  • dir_path (Path) – The directory with the data

  • df (pd.DataFrame) – Table of all the sources and their sample ranges

  • from_sample (int) – Reading from sample

  • to_sample (int) – Reading to sample

Returns

DataFrame with data files to read as indices and reading information as columns such as number of samples to read, channel scalings etc.

Return type

pd.DataFrame

Raises

TimeDataReadError – If somehow there’s a mismatch in the total number of samples to read per file and the expected number of samples.