Synthetic Data

class rackio_AI.preprocessing.SyntheticData()

This class lets you add anomalies to a dataset to model the behavior of data that comes from the field. You can add the following anomalies:

  • Gaussian noise
  • Outliers
  • Frozen data
  • Excessive noise
  • Out of range data
  • Instrument decalibration
  • Sensor drift
add_instrument_error(self)

Add instrument error according to the instrument's error and repeatability

Parameters

None

return

  • data: (np.ndarray) data with instrument error
add_decalibration(self, decalibration_factor=2.0, duration=10)

Add instrument decalibration to the data

Parameters

  • decalibration_factor: (float) default=2.0: Instrument error amplitude relative to the original instrument error
  • duration: (int) default=10: Number of consecutive values in the data over which the anomaly is maintained

return

  • data: (np.ndarray) data with instrument decalibration
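
As a rough illustration of the idea, a decalibration can be modeled as a constant offset applied to a window of consecutive samples. The sketch below is not the rackio_AI implementation: the function name, the `start` and `span` arguments, and the assumption that `error` is a fraction of the instrument span are all hypothetical, and plain Python lists stand in for the np.ndarray.

```python
def add_decalibration_sketch(values, start, error, span,
                             decalibration_factor=2.0, duration=10):
    """Return a copy of values with a constant offset on [start, start + duration)."""
    # Assumption: `error` is a fraction of the instrument span, so the
    # offset amounts to decalibration_factor * error * span.
    offset = decalibration_factor * error * span
    out = list(values)
    for i in range(start, min(start + duration, len(out))):
        out[i] += offset
    return out


signal = [100.0] * 20
shifted = add_decalibration_sketch(signal, start=5, error=0.0025, span=500.0)
```

Samples outside the window are left untouched, so the anomaly appears as a step up followed by a step back down.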
add_sensor_drift(self, sensor_drift_factor=5.0, duration=10)

Add a sensor drift anomaly to the data

Parameters

  • sensor_drift_factor: (float) default=5.0: Sensor drift amplitude relative to the original instrument error
  • duration: (int) default=10: Number of consecutive values in the data over which the anomaly is maintained

return

  • data: (np.ndarray) data with sensor drift
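
One common way to picture sensor drift is an offset that grows gradually instead of appearing at once. The sketch below assumes a linear ramp that reaches `sensor_drift_factor * error * span` at the end of the window; the ramp shape, the function name, and the `start` and `span` arguments are assumptions made for illustration, not the library's documented behavior.

```python
def add_sensor_drift_sketch(values, start, error, span,
                            sensor_drift_factor=5.0, duration=10):
    """Return a copy of values with a linearly growing offset over the window."""
    # Assumption: `error` is a fraction of the instrument span.
    final_offset = sensor_drift_factor * error * span
    out = list(values)
    stop = min(start + duration, len(out))
    for step, i in enumerate(range(start, stop), start=1):
        # The offset grows by the same amount at every sample.
        out[i] += final_offset * step / duration
    return out


signal = [100.0] * 20
drifted = add_sensor_drift_sketch(signal, start=5, error=0.0025, span=500.0)
```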
add_excessive_noise(self, error_factor=3.0, repeteability_factor=3.0, duration=10)

Add an excessive Gaussian noise anomaly to the data

Parameters

  • error_factor: (float) default=3.0: Instrument error amplitude relative to the original instrument error
  • repeteability_factor: (float) default=3.0: Instrument repeatability amplitude relative to the original repeatability
  • duration: (int) default=10: Number of consecutive values in the data over which the anomaly is maintained

return

  • data: (np.ndarray) data with excessive white noise
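
The following sketch shows one possible reading of this anomaly: zero-mean Gaussian noise whose standard deviation is the nominal error and repeatability scaled by their factors. How the library actually combines the two terms is not stated here, so the sum of two independent draws, the `start`, `span`, and `seed` arguments, and the function name are assumptions.

```python
import random


def add_excessive_noise_sketch(values, start, error, repeteability, span,
                               error_factor=3.0, repeteability_factor=3.0,
                               duration=10, seed=None):
    """Return a copy of values with amplified Gaussian noise over the window."""
    rng = random.Random(seed)
    # Assumption: error and repeteability are fractions of the instrument span.
    sigma_error = error_factor * error * span
    sigma_repeat = repeteability_factor * repeteability * span
    out = list(values)
    for i in range(start, min(start + duration, len(out))):
        out[i] += rng.gauss(0.0, sigma_error) + rng.gauss(0.0, sigma_repeat)
    return out


signal = [100.0] * 20
noisy = add_excessive_noise_sketch(signal, start=5, error=0.0025,
                                   repeteability=0.001, span=500.0, seed=0)
```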
add_frozen_data(self, duration=10)

Add a frozen-data anomaly to the data

Parameters

  • duration: (int) default=10: Number of consecutive values in the data over which the anomaly is maintained

return

  • data: (np.ndarray) data with frozen instrument
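
A frozen instrument can be pictured as a reading that stops updating. The minimal sketch below holds the value found at a hypothetical `start` index for `duration` samples; the function name and the `start` argument are illustrative and not part of the documented API.

```python
def add_frozen_data_sketch(values, start, duration=10):
    """Return a copy of values where the reading at start is held for duration samples."""
    out = list(values)
    stop = min(start + duration, len(out))
    for i in range(start, stop):
        out[i] = out[start]  # repeat the last "live" reading
    return out


ramp = [float(i) for i in range(20)]
frozen = add_frozen_data_sketch(ramp, start=5, duration=10)
```

On a ramp signal the effect is easy to spot: the output is flat over the window and then jumps back to the true value.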
add_outlier(self, span_factor=0.03)

Add an outlier anomaly to the data

Parameters

  • span_factor: (float) default=0.03: Fraction of the instrument range to add to the data

return

  • data: (np.ndarray) data with outlier
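
To make `span_factor` concrete, the sketch below adds that fraction of the instrument range to a single sample. The `index`, `lower_limit`, `upper_limit`, and `sign` arguments and the function name are hypothetical; how the library picks the affected sample is not described here.

```python
def add_outlier_sketch(values, index, lower_limit, upper_limit,
                       span_factor=0.03, sign=1):
    """Return a copy of values with one sample shifted by a fraction of the range."""
    out = list(values)
    out[index] += sign * span_factor * (upper_limit - lower_limit)
    return out


signal = [250.0] * 10
spiked = add_outlier_sketch(signal, index=4, lower_limit=0.0, upper_limit=500.0)
```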
add_out_of_range(self, duration=10)

Add an out-of-range anomaly to the data

Parameters

  • duration: (int) default=10: Number of consecutive values in the data over which the anomaly is maintained

return

  • data: (np.ndarray) data with out-of-range values
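
An out-of-range anomaly means readings that fall outside the instrument limits for a while. The sketch below pushes a window of samples above the upper limit; the `margin` parameter (how far past the limit the readings go), the `start` argument, and the function name are hypothetical choices for illustration only.

```python
def add_out_of_range_sketch(values, start, upper_limit, duration=10, margin=0.05):
    """Return a copy of values with a window of readings above upper_limit."""
    out = list(values)
    for i in range(start, min(start + duration, len(out))):
        # margin is a hypothetical overshoot, as a fraction of the upper limit
        out[i] = upper_limit * (1.0 + margin)
    return out


signal = [250.0] * 20
saturated = add_out_of_range_sketch(signal, start=5, upper_limit=500.0)
```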
__call__(self, decalibrations=None, sensor_drift=None, excesive_noise=None, frozen_data=None, outliers=None, out_of_range=None, add_WN=False, columns_names=[], **options)

Add the requested anomalies to the data when the instance is called


Parameters

  • decalibrations: (int) default=0: number of decalibration anomalies to add
  • sensor_drift: (int) default=0: number of sensor drift anomalies to add
  • excesive_noise: (int) default=0: number of excessive noise anomalies to add
  • frozen_data: (int) default=0: number of frozen data anomalies to add
  • outliers: (int) default=0: number of outlier anomalies to add
  • out_of_range: (int) default=0: number of out-of-range anomalies to add
  • add_WN: (bool) default=False: whether to add instrument error
  • options:
    • duration: (dict)
      • min: (int) default=10
      • max: (int) default=50
    • view: (bool) default=False
    • columns: (list[int]) default=[0]

return

  • data: (np.array, pd.DataFrame): data with anomalies

Code snippet

>>> import os
>>> from rackio_AI import RackioAI, get_directory, SyntheticData
>>> filename = os.path.join(get_directory('pkl_files'), 'test_data.pkl')
>>> data = RackioAI.load(filename)
>>> error = [0.0025, 0.0025, 0.0025, 0.0025]
>>> repeteability = [0.001, 0.001, 0.001, 0.001]
>>> lower_limit = [0, 0, 400000, 100000]
>>> upper_limit = [500, 500, 1200000, 600000]
>>> dead_band = [0.001, 0.001, 0.001, 0.001]
>>> SD = SyntheticData()
>>> SD.add_attributes(error=error, repeteability=repeteability, lower_limit=lower_limit, upper_limit=upper_limit, dead_band=dead_band)
>>> data = SD(frozen_data=2, out_of_range=1, add_WN=True, view=False, columns=[0,1,2,3], duration={'min': 20, 'max': 100})
done(self, view=False, **options)

This method lets you plot the anomalies added to the data

Parameters

  • view: (bool) default=False: If True, plot the data; if False, do not plot
  • options:
    • columns: (list[int]) columns to plot in a list
    • ylabel: (str) ylabel string
    • xlabel: (str) xlabel string

return

None
round_by_dead_band(self)

Round data according to the instrument dead band

Parameters

  • data: (np.array)

return

  • data: (np.array): data rounded according to the dead band
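
One way to read dead-band rounding is that the instrument cannot resolve changes smaller than the dead band, so each reading snaps to the nearest multiple of it. The sketch below assumes the dead band is given in the same units as the data, which may differ from how the library interprets `dead_band`; the function name is hypothetical.

```python
def round_by_dead_band_sketch(values, dead_band):
    """Snap each value to the nearest multiple of dead_band."""
    if dead_band <= 0:
        raise ValueError("dead_band must be positive")
    return [round(v / dead_band) * dead_band for v in values]


readings = [1.26, 1.24, 2.1]
rounded = round_by_dead_band_sketch(readings, dead_band=0.5)
```

Note that Python's built-in `round` resolves exact ties to the nearest even integer, so a reading that sits exactly halfway between two multiples may round down.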
view(self, columns=[0], xlabel='Time', ylabel='Amplitude')

Plot the data with anomalies added

Parameters

  • columns: (list) default=[0]: columns to plot
  • xlabel: (str) default='Time'
  • ylabel: (str) default='Amplitude'

return

None