I2Regression

Landsat5_old

Name	Default Value	Description	Type	Mandatory	Name
additional_features		OTB’s bandmath expressions, separated by comma.	str	False	additional_features
end_date		The last date of interpolated image time series : YYYYMMDD format.	str	False	end_date
keep_bands	[‘B1’, ‘B2’, ‘B3’, ‘B4’, ‘B5’, ‘B6’, ‘B7’]	The list of spectral bands used for classification.	list	False	keep_bands
start_date		The first date of interpolated image time series : YYYYMMDD format.	str	False	start_date
temporal_resolution	10	The temporal gap between two interpolations.	int	False	temporal_resolution
write_reproject_resampled_input_dates_stack	True	Flag to write resampled stack image for each date.	bool	False	write_reproject_resampled_input_dates_stack

Notes

end_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

keep_bands 

WARNING

For this parameter to be taken into account, the extract_bands variable in the iota2_feature_extraction section must also be set to True:

iota2_feature_extraction :
{
  'extract_bands':True,
}

start_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}
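As a rough illustration of how start_date, end_date and temporal_resolution relate to each other, the sketch below lists the dates they would imply. The helper name interpolation_dates is hypothetical, and including end_date when it falls exactly on a step is an assumption rather than documented iota2 behaviour.

```python
from datetime import datetime, timedelta


def interpolation_dates(start_date: str, end_date: str, temporal_resolution: int) -> list[str]:
    """List YYYYMMDD dates from start_date to end_date, one every temporal_resolution days.

    Hypothetical helper: whether iota2 includes end_date is an assumption.
    """
    if temporal_resolution <= 0:
        raise ValueError("temporal_resolution must be a positive number of days")
    # strptime rejects strings that cannot be parsed as a date
    current = datetime.strptime(start_date, "%Y%m%d")
    last = datetime.strptime(end_date, "%Y%m%d")
    if current > last:
        raise ValueError("start_date must not be later than end_date")
    dates = []
    while current <= last:
        dates.append(current.strftime("%Y%m%d"))
        current += timedelta(days=temporal_resolution)
    return dates


print(interpolation_dates("20180101", "20180131", 10))
# ['20180101', '20180111', '20180121', '20180131']
```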

Landsat8

Name	Default Value	Description	Type	Mandatory	Name
additional_features		OTB’s bandmath expressions, separated by comma.	str	False	additional_features
end_date		The last date of interpolated image time series : YYYYMMDD format.	str	False	end_date
keep_bands	[‘B1’, ‘B2’, ‘B3’, ‘B4’, ‘B5’, ‘B6’, ‘B7’]	The list of spectral bands used for classification.	list	False	keep_bands
start_date		The first date of interpolated image time series : YYYYMMDD format.	str	False	start_date
temporal_resolution	16	The temporal gap between two interpolations.	int	False	temporal_resolution
write_reproject_resampled_input_dates_stack	True	Flag to write resampled stack image for each date.	bool	False	write_reproject_resampled_input_dates_stack

Notes

end_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

keep_bands 

WARNING

For this parameter to be taken into account, the extract_bands variable in the iota2_feature_extraction section must also be set to True:

iota2_feature_extraction :
{
  'extract_bands':True,
}

start_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

Landsat8_old

Name	Default Value	Description	Type	Mandatory	Name
additional_features		OTB’s bandmath expressions, separated by comma.	str	False	additional_features
end_date		The last date of interpolated image time series : YYYYMMDD format.	str	False	end_date
keep_bands	[‘B1’, ‘B2’, ‘B3’, ‘B4’, ‘B5’, ‘B6’, ‘B7’]	The list of spectral bands used for classification.	list	False	keep_bands
start_date		The first date of interpolated image time series : YYYYMMDD format.	str	False	start_date
temporal_resolution	10	The temporal gap between two interpolations.	int	False	temporal_resolution
write_reproject_resampled_input_dates_stack	True	Flag to write resampled stack image for each date.	bool	False	write_reproject_resampled_input_dates_stack

Notes

end_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

keep_bands 

WARNING

For this parameter to be taken into account, the extract_bands variable in the iota2_feature_extraction section must also be set to True:

iota2_feature_extraction :
{
  'extract_bands':True,
}

start_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

Sentinel_2

Name	Default Value	Description	Type	Mandatory	Name
additional_features		OTB’s bandmath expressions, separated by comma.	str	False	additional_features
end_date		The last date of interpolated image time series : YYYYMMDD format.	str	False	end_date
keep_bands	[‘B1’, ‘B2’, ‘B3’, ‘B4’, ‘B5’, ‘B6’, ‘B7’]	The list of spectral bands used for classification.	list	False	keep_bands
start_date		The first date of interpolated image time series : YYYYMMDD format.	str	False	start_date
temporal_resolution	10	The temporal gap between two interpolations.	int	False	temporal_resolution
write_reproject_resampled_input_dates_stack	True	Flag to write resampled stack image for each date.	bool	False	write_reproject_resampled_input_dates_stack

Notes

end_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

keep_bands 

WARNING

For this parameter to be taken into account, the extract_bands variable in the iota2_feature_extraction section must also be set to True:

iota2_feature_extraction :
{
  'extract_bands':True,
}

start_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

Sentinel_2_L3A

Name	Default Value	Description	Type	Mandatory	Name
additional_features		OTB’s bandmath expressions, separated by comma.	str	False	additional_features
end_date		The last date of interpolated image time series : YYYYMMDD format.	str	False	end_date
keep_bands	[‘B1’, ‘B2’, ‘B3’, ‘B4’, ‘B5’, ‘B6’, ‘B7’]	The list of spectral bands used for classification.	list	False	keep_bands
start_date		The first date of interpolated image time series : YYYYMMDD format.	str	False	start_date
temporal_resolution	10	The temporal gap between two interpolations.	int	False	temporal_resolution
write_reproject_resampled_input_dates_stack	True	Flag to write resampled stack image for each date.	bool	False	write_reproject_resampled_input_dates_stack

Notes

end_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

keep_bands 

WARNING

For this parameter to be taken into account, the extract_bands variable in the iota2_feature_extraction section must also be set to True:

iota2_feature_extraction :
{
  'extract_bands':True,
}

start_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

Sentinel_2_S2C

Name	Default Value	Description	Type	Mandatory	Name
additional_features		OTB’s bandmath expressions, separated by comma.	str	False	additional_features
end_date		The last date of interpolated image time series : YYYYMMDD format.	str	False	end_date
keep_bands	[‘B1’, ‘B2’, ‘B3’, ‘B4’, ‘B5’, ‘B6’, ‘B7’]	The list of spectral bands used for classification.	list	False	keep_bands
start_date		The first date of interpolated image time series : YYYYMMDD format.	str	False	start_date
temporal_resolution	10	The temporal gap between two interpolations.	int	False	temporal_resolution
write_reproject_resampled_input_dates_stack	True	Flag to write resampled stack image for each date.	bool	False	write_reproject_resampled_input_dates_stack

Notes

end_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

keep_bands 

WARNING

For this parameter to be taken into account, the extract_bands variable in the iota2_feature_extraction section must also be set to True:

iota2_feature_extraction :
{
  'extract_bands':True,
}

start_date 

WARNING

For this parameter to be taken into account, the auto_date variable in the sensors_data_interpolation section must also be set to False:

sensors_data_interpolation :
{
  'auto_date':False,
}

arg_train

Name	Default Value	Description	Type	Mandatory	Name
deep_learning_parameters	{}	Deep learning parameter description is available here.	dict	False	deep_learning_parameters
features	[‘NDVI’, ‘NDWI’, ‘Brightness’]	List of additional features computed.	list	False	features
learning_samples_extension	sqlite	Learning samples file extension, possible values are ‘sqlite’ and ‘csv’.	str	False	learning_samples_extension
random_seed	None	Fix the random seed for random split of reference data.	int	False	random_seed
ratio	0.5	Should be between 0.0 and 1.0 and represents the proportion of the dataset to include in the train split.	float	False	ratio
runs	1	Number of independent runs processed.	int	False	runs
sample_augmentation	{‘activate’: False, ‘bins’: 10}	OTB parameters for sample augmentation.	dict	False	sample_augmentation
sample_management	None	Absolute path to a CSV file containing samples transfer strategies.	str	False	sample_management
sample_selection	{‘sampler’: ‘random’, ‘strategy’: ‘all’}	OTB parameters for sampling the validation set.	dict	False	sample_selection
sample_validation	{‘sampler’: ‘random’, ‘strategy’: ‘all’}	OTB parameters for sampling the validation set.	dict	False	sample_validation
sampling_validation	False	Enable sampling validation.	bool	False	sampling_validation

Notes

features 

This parameter enables the computation of the three indices if they are available for the sensor used. It is not possible to select only one of them.

learning_samples_extension 

Default value is ‘sqlite’ (faster). If the number of features is greater than 2000, it should be set to ‘csv’, as an sqlite file does not accept more than 2000 columns.

random_seed 

Fix the random seed used for random split of reference data. If set, the results must be the same for a given classifier.
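The effect of a fixed seed on a random split can be sketched as follows. This is only an illustration using Python's standard library: split_reference is a hypothetical helper, and iota2's actual split is not shown here.

```python
import random


def split_reference(sample_ids, ratio=0.5, random_seed=None):
    """Randomly split sample identifiers into (train, validation) lists.

    Hypothetical helper, not iota2's implementation.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    rng = random.Random(random_seed)  # a fixed seed gives a reproducible shuffle
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * ratio)
    return shuffled[:n_train], shuffled[n_train:]


ids = list(range(10))
first = split_reference(ids, ratio=0.5, random_seed=42)
second = split_reference(ids, ratio=0.5, random_seed=42)
print(first == second)  # the same seed gives the same split
```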

runs 

Number of independent runs processed. Each run has its own learning samples. Must be an integer greater than 0.

sample_augmentation 

In supervised classification, the balance between class samples is important. There are many ways to manage class balancing in iota2, using sample_selection or the classifier’s options to limit the number of samples by class. Another approach is to generate synthetic samples. This is the purpose of this functionality, which is called ‘sample augmentation’.
{'activate':False}

Example

sample_augmentation : {
   'target_models' : ['1', '2'],
   'strategy' : 'jitter',
   'strategy.jitter.stdfactor' : 10,
   'strategy.smote.neighbors' : 5,
   'samples.strategy' : 'balance',
   'activate' : True
}
iota2 implements an interface to the OTB SampleAugmentation application. There are three methods to generate samples : replicate, jitter and smote. The documentation here explains the difference between these approaches.

samples.strategy specifies how many samples must be created. There are 3 different strategies:

minNumber
To set the minimum number of samples by class required.

balance
Balance all classes with the same number of samples as the majority one.

byClass
Augment only some of the classes.

Parameters related to minNumber and byClass strategies are:

samples.strategy.minNumber
Minimum number of samples.

samples.strategy.byClass
Path to a CSV file containing the class’s label in the first column and the minimum number of samples required in the second column.

In the above example, classes of models ‘1’ and ‘2’ will be augmented to the most represented class in the corresponding model using the jitter method.
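To make the three samples.strategy values more concrete, here is a minimal sketch of how many synthetic samples each one would call for. The function samples_to_create and its arguments are hypothetical; by_class is a plain dictionary standing in for the CSV file, and the real counts are computed by OTB.

```python
def samples_to_create(counts, strategy, min_number=None, by_class=None):
    """Return, per class label, how many synthetic samples would be needed.

    counts: {class_label: current number of samples}
    strategy: 'minNumber', 'balance' or 'byClass'
    by_class: {class_label: required minimum}, standing in for the CSV file
    """
    if strategy == "minNumber":
        targets = {label: min_number for label in counts}
    elif strategy == "balance":
        majority = max(counts.values())
        targets = {label: majority for label in counts}
    elif strategy == "byClass":
        # classes missing from by_class are left untouched
        targets = {label: by_class.get(label, n) for label, n in counts.items()}
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # classes already above their target are not reduced
    return {label: max(0, targets[label] - n) for label, n in counts.items()}


counts = {1: 100, 2: 40, 3: 10}
print(samples_to_create(counts, "balance"))                    # {1: 0, 2: 60, 3: 90}
print(samples_to_create(counts, "minNumber", min_number=50))   # {1: 0, 2: 10, 3: 40}
print(samples_to_create(counts, "byClass", by_class={3: 30}))  # {1: 0, 2: 0, 3: 20}
```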

To perform sample augmentation for regression problems, we return to a configuration similar to classification. For that purpose we create fake classes using the bins parameter. The bins parameter can be an integer, in which case the interval of values of the target output variable is divided into bins sub-intervals of equal width, and each sample gets a fake class corresponding to the number of the interval in which its label falls.

bins can also be a list of ascending values used as interval boundaries to assign to the classes.
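The mapping from target values to fake classes can be sketched as below. This is an illustration only: fake_classes is a hypothetical name, numbering classes from 1 is an assumption, and so is treating the first and last values of a list as the outer bounds.

```python
from bisect import bisect_right


def fake_classes(values, bins):
    """Assign each target value a fake class label from 1 to the number of intervals."""
    if isinstance(bins, int):
        low, high = min(values), max(values)
        width = (high - low) / bins
        # interior boundaries of `bins` equal-width sub-intervals
        boundaries = [low + i * width for i in range(1, bins)]
    else:
        # assumption: the first and last values of the list are the outer bounds
        boundaries = list(bins)[1:-1]
    return [bisect_right(boundaries, value) + 1 for value in values]


targets = [0.0, 1.0, 2.5, 4.9, 5.0, 7.5, 10.0]
print(fake_classes(targets, 4))           # [1, 1, 2, 2, 3, 4, 4]
print(fake_classes(targets, [0, 5, 10]))  # [1, 1, 1, 1, 2, 2, 2]
```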

sample_management 

The CSV must contain a row per transfer:
>>> cat /absolute/path/myRules.csv
    1,2,4,2
Meaning:

source	destination	class name	quantity
1	2	4	2

Currently, setting the ‘random_seed’ parameter has no effect on this workflow.
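A rule file with the four columns described above could be read as in the following sketch. TransferRule and read_rules are hypothetical names, and io.StringIO stands in for the real file so that the example is self-contained.

```python
import csv
import io
from typing import NamedTuple


class TransferRule(NamedTuple):
    source: str       # first column
    destination: str  # second column
    class_name: str   # third column
    quantity: int     # fourth column


def read_rules(stream):
    """Parse rows of the form source,destination,class name,quantity."""
    rules = []
    for row in csv.reader(stream):
        if not row:
            continue  # skip blank lines
        source, destination, class_name, quantity = (cell.strip() for cell in row)
        rules.append(TransferRule(source, destination, class_name, int(quantity)))
    return rules


# io.StringIO stands in for open("/absolute/path/myRules.csv")
print(read_rules(io.StringIO("1,2,4,2\n")))
```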

sample_selection 

This field sets the polygon sampling strategy. It directly refers to options of OTB’s SampleSelection application.

Example

"sample_selection": {
        "sampler": "random",
        "strategy": "percent",
        "strategy.percent.p": 0.2,
        "per_models": [
            {
                "target_model": "4",
                "sampler": "periodic"
            }
        ]
    }

In the example above, all polygons will be sampled with a 20% ratio, but the polygons which belong to model 4 will be periodically sampled, instead of the random sampling used for the other polygons. Notice that the per_models key contains a list of strategies, so we can imagine the following:

"sample_selection": {
    "sampler": "random",
    "strategy": "percent",
    "strategy.percent.p": 0.2,
    "per_models": [
        {
            "target_model": "4",
            "sampler": "periodic"
        },
        {
            "target_model": "1",
            "sampler": "random",
            "strategy": "byclass",
            "strategy.byclass.in": "/path/to/myCSV.csv"
        }
    ]
}

Where the first column of /path/to/myCSV.csv is class label (integer), second one is the required samples number (integer).
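One way to read the per_models mechanism is that each entry overrides the global keys it redefines for its target model. The sketch below follows that reading; it is an assumption, not a description of how iota2 resolves the options, and options_for_model is a hypothetical helper.

```python
def options_for_model(sample_selection, model_name):
    """Return the sampling options that would apply to one model.

    Assumption: a per_models entry overrides the global keys it redefines
    and inherits the others.
    """
    options = {k: v for k, v in sample_selection.items() if k != "per_models"}
    for entry in sample_selection.get("per_models", []):
        if entry.get("target_model") == model_name:
            options.update({k: v for k, v in entry.items() if k != "target_model"})
    return options


sample_selection = {
    "sampler": "random",
    "strategy": "percent",
    "strategy.percent.p": 0.2,
    "per_models": [{"target_model": "4", "sampler": "periodic"}],
}
print(options_for_model(sample_selection, "4"))
# {'sampler': 'periodic', 'strategy': 'percent', 'strategy.percent.p': 0.2}
print(options_for_model(sample_selection, "1"))
# {'sampler': 'random', 'strategy': 'percent', 'strategy.percent.p': 0.2}
```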

sample_validation 

This field sets the polygon sampling strategy. It directly refers to options of OTB’s SampleSelection application.

Example

"sample_selection": {
        "sampler": "random",
        "strategy": "percent",
        "strategy.percent.p": 0.2,
        "per_models": [
            {
                "target_model": "4",
                "sampler": "periodic"
            }
        ]
    }

In the example above, all polygons will be sampled with a 20% ratio, but the polygons which belong to model 4 will be periodically sampled, instead of the random sampling used for the other polygons. Notice that the per_models key contains a list of strategies, so we can imagine the following:

"sample_selection": {
    "sampler": "random",
    "strategy": "percent",
    "strategy.percent.p": 0.2,
    "per_models": [
        {
            "target_model": "4",
            "sampler": "periodic"
        },
        {
            "target_model": "1",
            "sampler": "random",
            "strategy": "byclass",
            "strategy.byclass.in": "/path/to/myCSV.csv"
        }
    ]
}

Where the first column of /path/to/myCSV.csv is class label (integer), second one is the required samples number (integer).

builders

Name	Default Value	Description	Type	Mandatory	Name
builders_class_name	[‘I2Classification’]	The name of the class defining the builder.	list	False	builders_class_name
builders_paths	/path/to/iota2/sources	The path to user builders.	str	False	builders_paths

Notes

builders_class_name 

Available builders are : ‘I2Classification’, ‘I2FeaturesMap’ and ‘I2Obia’.

builders_paths 

If not indicated, the iota2 source directory is used: */iota2/sequence_builders/.

chain

Name	Default Value	Description	Type	Mandatory	Name
check_inputs	True	Enable the inputs verification	bool	False	check_inputs
cloud_threshold	0	Threshold to consider that a pixel is valid.	int	False	cloud_threshold
data_field	None	Field name indicating classes labels in ground_truth	str	True	data_field
first_step	None	The step group name indicating where the chain starts.	str	False	first_step
ground_truth	None	Absolute path to reference data.	str	True	ground_truth
last_step	None	The step group name indicating where the chain ends.	str	False	last_step
list_tile	None	List of tiles to process, separated by space.	str	True	list_tile
logger_level	INFO	Set the logger level: NOTSET, DEBUG, INFO, WARNING, ERROR, CRITICAL.	str	False	logger_level
minimum_required_dates	2	Required minimum number of available dates for each sensor.	int	False	minimum_required_dates
output_path	None	Absolute path to the output directory	str	True	output_path
proj	None	The projection wanted. Format EPSG:XXXX is mandatory.	str	True	proj
region_field	region	The column name for the region indicator in the `region_path` file.	str	False	region_field
region_path	None	Absolute path to a region vector file.	str	False	region_path
remove_output_path	True	Before the launch of iota2, remove the content of output_path.	bool	False	remove_output_path
s1_path	None	Absolute path to Sentinel-1 configuration file.	str	False	s1_path
s2_l3a_output_path	None	Absolute path to store preprocessed data in a dedicated directory.	str	False	s2_l3a_output_path
s2_l3a_path	None	Absolute path to Sentinel-2 L3A images (THEIA format).	str	False	s2_l3a_path
s2_output_path	None	Absolute path to store preprocessed data in a dedicated directory.	str	False	s2_output_path
s2_path	None	Absolute path to Sentinel-2 images (THEIA format).	str	False	s2_path
s2_s2c_output_path	None	Absolute path to store preprocessed data in a dedicated directory.	str	False	s2_s2c_output_path
s2_s2c_path	None	Absolute path to Sentinel-2 images (Sen2Cor format).	str	False	s2_s2c_path
spatial_resolution	[]	Output spatial resolution.	list or scalar	False	spatial_resolution

Notes

check_inputs 

Enable the inputs verification, for instance checking that the region file intersects the reference data. It can take a lot of time for large datasets.

cloud_threshold 

Indicates the threshold for a polygon to be used for learning. It uses the validity count, which is incremented if a cloud, a cloud shadow or a saturated pixel is detected.

data_field 

All the labels values must be different from 0.

It is recommended to use a continuous range of values but it is not mandatory.

Keep in mind that the final product type is detected according to the maximum label value.

Try to keep values between 1 and 255 to avoid heavy products.
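The link between the maximum label value and the weight of the final product can be illustrated with the sketch below. The type names and thresholds are the usual unsigned integer ranges; which types iota2 actually selects is an assumption, and smallest_unsigned_type is a hypothetical helper.

```python
def smallest_unsigned_type(labels):
    """Pick the smallest unsigned integer type able to store every label."""
    if any(label == 0 for label in labels):
        raise ValueError("label values must be different from 0")
    maximum = max(labels)
    for name, limit in (("uint8", 255), ("uint16", 65535), ("uint32", 4294967295)):
        if maximum <= limit:
            return name
    raise ValueError("label value too large")


print(smallest_unsigned_type([1, 2, 11, 211]))  # uint8
print(smallest_unsigned_type([1, 2, 300]))      # uint16
```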

output_path 

Absolute path to the output directory. It is recommended to have one directory per run of the chain.

region_field 

This column in the database must contain strings which can be converted into integers. For instance, ‘1_2’ does not match this condition.

It is mandatory that the region identifiers are > 0.
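The two constraints above can be checked with a short sketch such as the following. is_valid_region is a hypothetical helper; note that Python's int() accepts underscores between digits, so the check relies on str.isdigit() to reject values like ‘1_2’.

```python
def is_valid_region(identifier: str) -> bool:
    """True if the identifier is made of ASCII digits only and is greater than 0."""
    # int("1_2") would return 12 in Python, so digits are checked explicitly
    return identifier.isascii() and identifier.isdigit() and int(identifier) > 0


print(is_valid_region("12"))   # True
print(is_valid_region("1_2"))  # False
print(is_valid_region("0"))    # False
```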

remove_output_path 

Before the launch of iota2, remove the content of output_path. This only applies if first_step is init and the folder name is valid.

spatial_resolution 

The expected spatial resolution. It can be provided as an integer or a float, or as a list containing two values for a non-square resolution.
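The accepted forms of spatial_resolution can be brought to a single representation as in this sketch. normalize_resolution is a hypothetical helper; the (x, y) order of the two values and the meaning given to an empty list are assumptions.

```python
def normalize_resolution(spatial_resolution):
    """Return an (x, y) pair from a scalar or a two-value list.

    Assumption: an empty list (the default) means no resolution was requested.
    """
    if isinstance(spatial_resolution, (int, float)):
        return (spatial_resolution, spatial_resolution)
    values = tuple(spatial_resolution)
    if not values:
        return None
    if len(values) != 2:
        raise ValueError("expected a scalar or a list of two values")
    return values


print(normalize_resolution(10))        # (10, 10)
print(normalize_resolution([10, 20]))  # (10, 20)
print(normalize_resolution([]))        # None
```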

external_features

Name	Default Value	Description	Type	Mandatory	Name
concat_mode	True	Enable the use of all features.	bool	False	concat_mode
exogeneous_data	None	Path to a Geotiff file containing additional data to be used in external features.	str	False	exogeneous_data
external_features_flag	False	Enable the external features mode.	bool	False	external_features_flag
functions	None	Functions list to be used to compute features.	str/list	False	functions
module	/path/to/iota2/sources	Absolute path for user source code.	str	False	module
no_data_value	-10000	Value considered as no_data in features map mosaic (‘I2FeaturesMap’ builder name).	int	False	no_data_value
output_name	None	Temporary chunks are written using this name as prefix.	str	False	output_name

Notes

concat_mode 

If disabled, only external features are used in the whole processing.

exogeneous_data 

If exogeneous_data contains ‘$TILE’, it will be replaced by the name of the tile being processed. If you want to reproject your data on given tiles, you can use the split_raster_into_tiles.py command line tool.

Usage: split_raster_into_tiles.py --help.

functions 

Can be a string of space-separated function names. Can be a list of either strings of function name or lists of one function name and one argument mapping.
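The accepted shapes of the functions parameter can be illustrated by normalizing them to a single form. In this sketch normalize_functions is a hypothetical helper and the function names used in the calls (get_ndvi, get_soil_index, scale_band) are invented for the example.

```python
def normalize_functions(functions):
    """Return a list of (function_name, keyword_arguments) pairs."""
    if functions is None:
        return []
    if isinstance(functions, str):
        # a string of space-separated function names
        return [(name, {}) for name in functions.split()]
    normalized = []
    for item in functions:
        if isinstance(item, str):
            normalized.append((item, {}))
        else:
            name, arguments = item  # one function name and one argument mapping
            normalized.append((name, dict(arguments)))
    return normalized


print(normalize_functions("get_ndvi get_soil_index"))
# [('get_ndvi', {}), ('get_soil_index', {})]
print(normalize_functions(["get_ndvi", ["scale_band", {"factor": 2}]]))
# [('get_ndvi', {}), ('scale_band', {'factor': 2})]
```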

iota2_feature_extraction

Name	Default Value	Description	Type	Mandatory	Name
acor_feat	False	Apply atmospherically corrected features	bool	False	acor_feat
copy_input	True	Use spectral bands as features.	bool	False	copy_input
extract_bands	False		bool	False	extract_bands
keep_duplicates	True	Using ‘rel_refl’ can generate duplicated features (e.g. NDVI). Set to False to remove these duplicated features.	bool	False	keep_duplicates
rel_refl	False	Compute relative reflectances by the red band.	bool	False	rel_refl

Notes

acor_feat 

Apply atmospherically corrected features as explained at : http://www.cesbio.ups-tlse.fr/multitemp/?p=12746.

multi_run_fusion

Name	Default Value	Description	Type	Mandatory	Name
fusionof_all_samples_validation	False	Enable the use of all reference data to evaluate the fusion raster.	bool	False	fusionof_all_samples_validation
keep_runs_results	True		bool	False	keep_runs_results
merge_run	False	Enable the fusion of regression mode, merging all run in a unique result.	bool	False	merge_run
merge_run_method	mean	Indicate the fusion of regression method: ‘mean’ or ‘median’.	str	False	merge_run_method
merge_run_ratio	0.1	Percentage of samples to use in order to evaluate the fusion raster.	float	False	merge_run_ratio

Notes

fusionof_all_samples_validation 

If the fusion mode is enabled, enable the use of all reference data samples for validation.

keep_runs_results 

merge_run_method 

In addition to the regression fusion map, a confidence map is also produced. If the merging method is the mean, the confidence map is computed as the standard deviation. If the median is chosen to merge the maps from the different runs, the confidence map is computed as the median absolute deviation.
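For a single pixel, the pairing between merging method and confidence measure can be sketched with Python's statistics module. merge_runs is a hypothetical helper, and the use of the population standard deviation (rather than the sample one) is an assumption.

```python
from statistics import mean, median, pstdev


def merge_runs(predictions, method="mean"):
    """Merge the per-run predictions of one pixel into (value, confidence)."""
    if method == "mean":
        # assumption: population standard deviation
        return mean(predictions), pstdev(predictions)
    if method == "median":
        center = median(predictions)
        # median absolute deviation around the median
        return center, median(abs(p - center) for p in predictions)
    raise ValueError("merge_run_method must be 'mean' or 'median'")


runs = [2.0, 4.0, 4.0, 6.0]
print(merge_runs(runs, "mean"))
print(merge_runs(runs, "median"))  # (4.0, 1.0)
```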

pretrained_model

Name	Default Value	Description	Type	Mandatory	Name
boundary_buffer	None	List of boundary buffer size	list	False	boundary_buffer
function	None	Predict function name.	str	False	function
mode	None	Algorithm nature (classification or regression).	str	False	mode
model	None	Serialized object containing the model.	str	False	model
module	/path/to/iota2/sources	Absolute path to the python module.	str	False	module

Notes

function 

This function must have the imposed signature. It does not accept any other parameters. All model-dedicated parameters must be stored alongside the model.

mode 

The python module must contain the predict function. It must handle all the potential dependencies and imports related to the correct model instantiation.

model 

In the configuration file, the mandatory keys $REGION and $SEED must be present as they are replaced by iota2. In case of only one region, the region value is set to 1. Look at the documentation about the model constraints.
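The substitution of the two mandatory keys can be pictured with the sketch below. The file name pattern and the helper model_path are hypothetical; only the $REGION and $SEED keys come from the description above.

```python
def model_path(template: str, region: str, seed: int) -> str:
    """Replace the $REGION and $SEED keys of a model path template."""
    missing = [key for key in ("$REGION", "$SEED") if key not in template]
    if missing:
        raise ValueError(f"missing mandatory key(s): {', '.join(missing)}")
    # plain replace is used because '$REGION_seed' would confuse string.Template
    return template.replace("$REGION", str(region)).replace("$SEED", str(seed))


print(model_path("/path/to/model_region_$REGION_seed_$SEED.txt", "1", 0))
# /path/to/model_region_1_seed_0.txt
```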

module 

The python module must contain the predict function. It must handle all the potential dependencies and imports related to the correct model instantiation.

python_data_managing

Name	Default Value	Description	Type	Mandatory	Name
chunk_size_mode	split_number	The chunk split mode, currently the choice is ‘split_number’.	str	False	chunk_size_mode
chunk_size_x	50	Number of columns for one chunk.	int	False	chunk_size_x
chunk_size_y	50	Number of rows for one chunk.	int	False	chunk_size_y
data_mode_access	gapfilled	Choose which data can be accessed in custom features.	str	False	data_mode_access
fill_missing_dates	False	Fill raw data with no data if dates are missing.	bool	False	fill_missing_dates
max_nn_inference_size	None	Maximum batch inference size.	int	False	max_nn_inference_size
number_of_chunks	50	The expected number of chunks.	int	False	number_of_chunks
padding_size_x	0	The padding for chunk.	int	False	padding_size_x
padding_size_y	0	The padding for chunk.	int	False	padding_size_y

Notes

data_mode_access 

Three values are allowed:

gapfilled: gives access only to the gapfilled data.

raw: gives access only to the original raw data.

both: gives access to both the gapfilled and the raw data.

Note

Data are spatially resampled; these values concern only temporal interpolation.

fill_missing_dates 

If raw data access is enabled, this option considers all unique dates for all tiles and identifies which dates are missing for each tile. A missing date is filled using a no data constant value. Cloud or saturation are not corrected, but masks are provided. Masks contain three values: 0 for valid data, 1 for cloudy or saturated pixels, 2 for a missing date.
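The filling logic described above can be sketched for one pixel per tile as follows. The data layout, the NO_DATA constant and the helper fill_missing_dates are hypothetical; only the mask value 2 for a missing date comes from the description.

```python
NO_DATA = -10000  # hypothetical no-data constant
MISSING_DATE = 2  # mask value for a missing date, as described above


def fill_missing_dates(tiles):
    """tiles: {tile_name: {date: (value, mask)}} for a single pixel per tile."""
    # all unique dates over all tiles
    all_dates = sorted({date for series in tiles.values() for date in series})
    filled = {}
    for tile, series in tiles.items():
        filled[tile] = {
            date: series.get(date, (NO_DATA, MISSING_DATE)) for date in all_dates
        }
    return filled


tiles = {
    "T31TCJ": {"20180101": (350, 0), "20180111": (410, 1)},
    "T31TDJ": {"20180101": (290, 0)},
}
print(fill_missing_dates(tiles)["T31TDJ"])
# {'20180101': (290, 0), '20180111': (-10000, 2)}
```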

max_nn_inference_size 

Only used when a neural network inference is performed. If not set (None), the inference size will be the same as the one used during the learning stage.

scikit_models_parameters

Name	Default Value	Description	Type	Mandatory	Name
cross_validation_folds	5	The number of k-folds.	int	False	cross_validation_folds
cross_validation_grouped	False		bool	False	cross_validation_grouped
keyword_arguments	{}	Keyword arguments to be passed to model.	dict	False	keyword_arguments
model_type	None	Machine learning algorithm’s name.	str	False	model_type
standardization	True		bool	False	standardization

Notes

keyword_arguments 

Keyword arguments to be passed to model.

model_type 

Models coming from scikit-learn are used if scikit_models_parameters.model_type is different from None. More information about how to use scikit-learn is available at iota2 and scikit-learn documentation.

sensors_data_interpolation

Name	Default Value	Description	Type	Mandatory	Name
auto_date	True	Enable the use of start_date and end_date	bool	False	auto_date
use_gapfilling	True	Enable the use of gapfilling (clouds/temporal interpolation).	bool	False	use_gapfilling
write_outputs	False	Write temporary files.	bool	False	write_outputs

Notes

auto_date 

If True, iota2 will automatically guess the first and the last interpolation dates. Otherwise, the start_date and end_date of each sensor will be used.

write_outputs 

Write the time series before and after gapfilling, the mask time series and also the feature time series. This option requires a large amount of free disk space.

task_retry_limits

Name	Default Value	Description	Type	Mandatory	Name
allowed_retry	0	Allow dask to retry a failed job N times.	int	False	allowed_retry
maximum_cpu	4	The maximum number of CPU available.	int	False	maximum_cpu
maximum_ram	16.0	The maximum amount of RAM available (GB).	float	False	maximum_ram

Notes

maximum_cpu 

The number of CPUs will be doubled if the task is killed due to RAM overconsumption, until maximum_cpu or allowed_retry is reached.

maximum_ram 

The amount of RAM will be doubled if the task is killed due to RAM overconsumption, until maximum_ram or allowed_retry is reached.
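The doubling rule can be pictured as a schedule of RAM requests. In this sketch ram_schedule and its initial_ram argument are hypothetical, and capping the last request at maximum_ram is an assumption.

```python
def ram_schedule(initial_ram, maximum_ram, allowed_retry):
    """RAM (GB) requested for the first attempt and for each retry."""
    schedule = [initial_ram]
    for _ in range(allowed_retry):
        if schedule[-1] >= maximum_ram:
            break  # the limit is reached, no further retry is useful
        # assumption: the request is capped at maximum_ram
        schedule.append(min(schedule[-1] * 2, maximum_ram))
    return schedule


print(ram_schedule(4.0, 16.0, 3))  # [4.0, 8.0, 16.0]
print(ram_schedule(4.0, 16.0, 1))  # [4.0, 8.0]
```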