MSE-CNN Implementation 1
Code database with the implementation of MSE-CNN, from the paper 'DeepQTMT: A Deep Learning Approach for Fast QTMT-based CU Partition of Intra-mode VVC'
msecnn_raulkviana.dataset_utils Namespace Reference

Classes

class  VideoCaptureYUV
 
class  VideoCaptureYUVV2
 

Functions

 yuv2bgr (matrix)
 Converts a YUV matrix to a BGR matrix.
 
 bgr2yuv (matrix)
 Converts BGR matrix to YUV matrix.
 
 extract_content (f)
 Extracts a single record from a binary file.
 
 file_stats (path)
 Finds out the size of the binary file and computes the number of records.
 
 show_bin_content (path, num_records=100)
 Show contents of a binary file containing encoding information.
 
 add_best_split (labels)
 Modifies labels by adding an extra parameter.
 
 read_from_records (path, num_records)
 Reads the information/file generated by the encoder and returns a dictionary containing all the info about the file: a dictionary of picture numbers, each of which leads to a dictionary with the info.
 
 process_info (content)
 Process the raw data from the labels given by the encoder.
 
 match_cu (CU, CTU, position, size)
 Verifies if the CUs are the same based on their position, size and other information.
 
 find_cu (df_cu, CTU, position, size)
 Verifies if the CU is in the dataframe, using the size and other information.
 
 build_entry (stg1=[], stg2=[], stg3=[], stg4=[], stg5=[], stg6=[])
 Builds an entry with all information needed for each stage, and also removes unnecessary info.
 
 add_cu_to_dict (cu_dict, cu)
 Adds information of a specific CU to the dictionary.
 
 transform_create_struct_faster_v2_mod_divs (f, f_name, num_records, output_dir, n_output_file, color_ch=0)
 First obtains all CTUs and CUs in the file using a dictionary/dataframe, then organizes them in a stage-oriented way.
 
 transform_create_struct_faster_v3 (f, f_name, num_records, output_dir, n_output_file, color_ch=0)
 First obtains all CTUs and CUs in the file using a dictionary/dataframe, then organizes them in a stage-oriented way.
 
 process_ctus_cus (df_ctus, df_cus)
 Function to create data structures to organize the CTUs and CUs.
 
 split (size, pos, split_mode)
 Split a CU in one of the specific modes (quad tree, binary vert tree, binary horz tree, ternary vert tree, etc.).
 
 transform_raw_dataset (dic)
 Transform raw dataset (dictionary with information of all datasets) and convert it to a list of dictionaries.
 
 get_files_from_folder (path, endswith=".yuv")
 This function obtains the names of all .yuv files in a given path.
 
 get_num_frames (path, name, width, height)
 Get number of frames in yuv file.
 
 get_file_metadata_info (path, name)
 Retrieves information about the YUV file (framerate, width, height and number of frames).
 
 get_file_metadata_info_mod (name)
 Retrieves information about the YUV file (framerate, width and height).
 
 encode_dataset (d_path="C:\\Users\\Raul\\Dropbox\\Dataset", e_path="C:\\Users\\Raul\\Documents\\GitHub\\CPIV\\VTM-7.0_Data\\bin\\vs16\\msvc-19.24\\x86_64\\release", ts=1, QP=32)
 This function encodes the entire dataset in a given path.
 
 compute_split_per_depth (d_path)
 Compute the percentage and number of splits per depth of the partitioning scheme.
 
 compute_split_per_depth_v2 (d_path)
 Compute the percentage and number of splits per depth of the partitioning scheme.
 
 compute_split_per_depth_v3 (d_path)
 Compute the percentage and number of splits per depth of the partitioning scheme.
 
 lst2csv (lst, name_of_file)
 Converts list of dictionaries to csv file.
 
 get_some_data_equaly (X, path_dir_l, classes, split_pos)
 Gets X amount of data from files.
 
 lst2csv_v2 (lst_lst, n_file, n_fields)
 Converts list to csv file using a pandas dataframe.
 
 csv2lst (csv_file)
 Reads csv file.
 
 file2lst (file)
 Reads file.
 
 lst2file (lst, name_of_file)
 Converts list of dictionaries to file.
 
 unite_labels_v6 (dir_path_l, n_output_file="labels_pickle", color_ch=0)
 Unites all the labels into a giant list.
 
 unite_labels_v6_mod (dir_path_l, n_output_file="labels_pickle", color_ch=0)
 Unites all the labels into a giant list.
 
 create_dir (output_dir)
 Creates a directory.
 
 labels_with_specific_cch (dir_path, cch=0)
 Obtains, from a group of labels in a pickle file, the CUs whose color channel is 'cch'.
 
 read_from_records_v2 (f, f_name, num_records)
 Read the information/file generated by the encoder.
 
 file_stats_v2 (path)
 Finds out the size of all binary files, computes the total amount of records and the amount of each CU.
 
 compute_split_proportions (path, num_cus=float('inf'))
 Compute the proportion of each split in the dataset.
 
 compute_split_proportions_with_custom_data (custom_dataset, stage, num_cus=float('inf'))
 Compute the proportion of each split in the dataset (Custom dataset class).
 
 compute_split_proportions_with_custom_data_multi (custom_dataset, split_pos_in_struct, num_cus=float('inf'))
 Compute the proportion of each split in the dataset (Custom dataset class).
 
 compute_split_proportions_with_path_multi_new (path, split_pos_in_struct, num_cus=float('inf'))
 Compute the proportion of each split in the dataset (Custom dataset class).
 
 compute_split_proportions_with_custom_data_multi_new (custom_dataset, split_pos_in_struct, num_cus=float('inf'))
 Compute the proportion of each split in the dataset (Custom dataset class).
 
 compute_split_proportions_labels (path, num_cus=float('inf'))
 Compute the proportion of each split in the dataset.
 
 balance_dataset (dir_path, stg, n_classes=6)
 Balances the dataset so that the number of samples per class is the same.
 
 balance_dataset_JF (dir_path, n_classes=6)
 Balances the dataset so that the number of samples per class is the same.
 
 balance_dataset_down (dir_path, n_classes=6)
 Balances the dataset so that the number of samples per class is the same.
 
 balance_dataset_down_v2 (dir_path)
 Balances the dataset so that the number of samples per class is the same.
 
 balance_dataset_down_v3 (dir_path)
 Balances the dataset so that the number of samples per class is the same.
 
 balance_dataset_down_v4 (dir_path)
 Balances the dataset so that the number of samples per class is the same.
 
 balance_dataset_up (dir_path, n_classes=6)
 Balances the dataset so that the number of samples per class is the same.
 
 balance_dataset_up_v2 (dir_path)
 Balances the dataset so that the number of samples per class is the same.
 
 balance_dataset_up_v3 (dir_path)
 Balances the dataset so that the number of samples per class is the same.
 
 gen_dataset_types (d_path, valid_percent)
 Generate a dataset for training, validation and testing.
 
 change_struct_64x64_eval (path_dir_l)
 This version is meant to be used to process the stage 1 and 2 data.
 
 change_struct_32x32_eval (path_dir_l)
 This version is meant to be used to process the stage 3 data.
 
 change_struct_64x64 (path_dir_l)
 This version is meant to be used to process the stage 1 and 2 data.
 
 change_struct_64x64_no_dupl_v3 (path_dir_l)
 This version is like the change_struct_64x64_no_dupl_v2, with threads.
 
 mod_64x64_threads (f, path_dir_l, right_rows, columns, new_dir)
 
 change_struct_64x64_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2.
 
 change_struct_32x32 (path_dir_l)
 This version is meant to be used to process the stage 3 data.
 
 change_struct_32x32_no_dupl (path_dir_l)
 This version is like the change_struct_32x32, but it removes possible duplicated rows.
 
 change_struct_32x32_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl, but it is smarter.
 
 change_struct_32x32_no_dupl_v3 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but uses threads.
 
 mod_32x32_threads (f, path_dir_l, right_rows, columns, new_dir)
 
 change_struct_32x32_no_dupl_v2_test (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but is for verifying if everything is right.
 
 change_struct_16x16_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 16x16 CUs.
 
 list2tuple (l)
 
 tuple2list (l)
 
 change_struct_8x8_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 8x8 CUs.
 
 change_struct_no_dupl_stg6_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 6.
 
 change_struct_no_dupl_stg5_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 5.
 
 change_struct_no_dupl_stg2_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 2.
 
 change_struct_no_dupl_stg4_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 4.
 
 change_struct_no_dupl_stg3_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 3.
 
 change_struct_32x16_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 32x16 CUs.
 
 change_struct_32x8_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 32x8 CUs.
 
 change_struct_16x8_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 16x8 CUs.
 
 change_struct_8x4_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 8x4 CUs.
 
 change_struct_32x4_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 32x4 CUs.
 
 change_struct_16x4_no_dupl_v2 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 16x4 CUs.
 
 change_struct_16x16_no_dupl_v3 (path_dir_l)
 This version is like the change_struct_16x16_no_dupl_v2, but uses threads.
 
 mod_16x16_threads (f, path_dir_l, right_rows, columns, new_dir)
 
 change_struct_16x16 (path_dir_l)
 This version is meant to be used to process the stage 4 data.
 
 change_struct_no_dupl_stg_4_complexity_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 4.
 
 change_struct_no_dupl_stg_3_complexity_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 3.
 
 change_struct_no_dupl_stg_2_complexity_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 2.
 
 change_struct_no_dupl_stg_6_complexity_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 6.
 
 change_struct_no_dupl_stg_5_complexity_v4 (path_dir_l)
 This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 5.
 

Detailed Description

@package docstring 

@file dataset_utils.py 

@brief Useful functions to manipulate data and to change and create structures 
 
@section libraries_dataset_utils Libraries 
- os
- utils
- pandas
- torch
- csv
- struct
- numpy
- sklearn.model_selection
- cv2
- threading
- pickle
- shutil
- sys
- time
- math
- re

@section classes_dataset_utils Classes 
- VideoCaptureYUV
- VideoCaptureYUVV2
 
@section functions_dataset_utils Functions
- extract_content(f)
- file_stats(path)
- show_bin_content(path, num_records=100)
- add_best_split(labels)
- read_from_records(path, num_records)
- process_info(content)
- match_cu(CU, CTU, position, size)
- find_cu(df_cu, CTU, position, size)
- build_entry(stg1=[], stg2=[], stg3=[], stg4=[], stg5=[], stg6=[])
- add_cu_to_dict(cu_dict, cu)
- transform_create_struct_faster_v2_mod_divs(f, f_name, num_records, output_dir, n_output_file, color_ch=0)
- transform_create_struct_faster_v3(f, f_name, num_records, output_dir, n_output_file, color_ch=0)
- process_ctus_cus(df_ctus, df_cus)
- split(size, pos, split_mode)
- transform_raw_dataset(dic)
- get_files_from_folder(path, endswith=".yuv")
- get_num_frames(path, name, width, height)
- get_file_metadata_info(path, name)
- get_file_metadata_info_mod(name)
- encode_dataset
- compute_split_per_depth(d_path)
- compute_split_per_depth_v2(d_path)
- compute_split_per_depth_v3(d_path)
- lst2csv(lst, name_of_file)
- get_some_data_equaly(X, path_dir_l, classes, split_pos)
- lst2csv_v2(lst_lst, n_file, n_fields)
- csv2lst(csv_file)
- file2lst(file)
- lst2file(lst, name_of_file)
- unite_labels_v6(dir_path_l, n_output_file="labels_pickle", color_ch=0)
- unite_labels_v6_mod(dir_path_l, n_output_file="labels_pickle", color_ch=0)
- create_dir(output_dir)
- labels_with_specific_cch(dir_path, cch=0)
- read_from_records_v2(f, f_name, num_records)
- file_stats_v2(path)
- compute_split_proportions(path, num_cus=float('inf'))
- compute_split_proportions_with_custom_data(custom_dataset, stage, num_cus=float('inf'))
- compute_split_proportions_with_custom_data_multi(custom_dataset, split_pos_in_struct, num_cus=float('inf'))
- compute_split_proportions_with_path_multi_new(path, split_pos_in_struct, num_cus=float('inf'))
- compute_split_proportions_with_custom_data_multi_new(custom_dataset, split_pos_in_struct, num_cus=float('inf'))
- compute_split_proportions_labels(path, num_cus=float('inf'))
- balance_dataset(dir_path, stg, n_classes=6)
- balance_dataset_JF(dir_path, n_classes=6)
- balance_dataset_down(dir_path, n_classes=6)
- balance_dataset_down_v2(dir_path)
- balance_dataset_down_v3(dir_path)
- balance_dataset_down_v4(dir_path)
- balance_dataset_up(dir_path, n_classes=6)
- balance_dataset_up_v2(dir_path)
- balance_dataset_up_v3(dir_path)
- gen_dataset_types(d_path, valid_percent)
- change_struct_64x64_eval(path_dir_l)
- change_struct_32x32_eval(path_dir_l)
- change_struct_64x64(path_dir_l)
- change_struct_64x64_no_dupl_v3(path_dir_l)
- mod_64x64_threads(f, path_dir_l, right_rows, columns, new_dir)
- change_struct_64x64_no_dupl_v2(path_dir_l)
- change_struct_32x32(path_dir_l)
- change_struct_32x32_no_dupl(path_dir_l)
- change_struct_32x32_no_dupl_v2(path_dir_l)
- change_struct_32x32_no_dupl_v3(path_dir_l)
- mod_32x32_threads(f, path_dir_l, right_rows, columns, new_dir)
- change_struct_32x32_no_dupl_v2_test(path_dir_l)
- change_struct_16x16_no_dupl_v2(path_dir_l)
- list2tuple(l)
- tuple2list(l)
- change_struct_8x8_no_dupl_v2(path_dir_l)
- change_struct_no_dupl_stg6_v4(path_dir_l)
- change_struct_no_dupl_stg5_v4(path_dir_l)
- change_struct_no_dupl_stg2_v4(path_dir_l)
- change_struct_no_dupl_stg4_v4(path_dir_l)
- change_struct_no_dupl_stg3_v4(path_dir_l)
- change_struct_32x16_no_dupl_v2(path_dir_l)
- change_struct_32x8_no_dupl_v2(path_dir_l)
- change_struct_16x8_no_dupl_v2(path_dir_l)
- change_struct_8x4_no_dupl_v2(path_dir_l)
- change_struct_32x4_no_dupl_v2(path_dir_l)
- change_struct_16x4_no_dupl_v2(path_dir_l)
- change_struct_16x16_no_dupl_v3(path_dir_l)
- mod_16x16_threads(f, path_dir_l, right_rows, columns, new_dir)
- change_struct_16x16(path_dir_l)
- change_struct_no_dupl_stg_4_complexity_v4(path_dir_l)
- change_struct_no_dupl_stg_3_complexity_v4(path_dir_l)
- change_struct_no_dupl_stg_2_complexity_v4(path_dir_l)
- change_struct_no_dupl_stg_6_complexity_v4(path_dir_l)
- change_struct_no_dupl_stg_5_complexity_v4(path_dir_l)
 
@section global_vars_dataset_utils Global Variables 
- None 

@section todo_dataset_utils TODO 
- None 

@section license License 
MIT License 
Copyright (c) 2022 Raul Kevin do Espirito Santo Viana
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

@section author_dataset_utils Author(s)
- Created by Raul Kevin Viana
- Last time modified is 2023-01-29 22:22:04.120175

Function Documentation

◆ add_best_split()

msecnn_raulkviana.dataset_utils.add_best_split (   labels)

Modifies labels by adding an extra parameter.

The labels are a dictionary of picture numbers, each of which leads to a dictionary with the info. For example: records = {"Pic_0" :{"CU_0": {"colorChannel": 1, "CULoc_left": 2, ... "split": 5 } ... ... } }

Parameters
[in] labels: Dictionary with the labels of the dataset
[out] new_labels: New dictionary with the labels of the dataset
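As a rough illustration of the structure above, a hypothetical sketch of adding the extra parameter. The key names "split" and "best_split" follow the records example, but the real implementation in dataset_utils.py may choose the value differently:

```python
# Hypothetical sketch only: the rule for choosing the best split is defined
# by the actual implementation, not by this example.
def add_best_split_sketch(labels):
    new_labels = {}
    for pic, cus in labels.items():
        new_labels[pic] = {}
        for cu_name, info in cus.items():
            info = dict(info)                       # keep the input intact
            info["best_split"] = info.get("split")  # the extra parameter
            new_labels[pic][cu_name] = info
    return new_labels

records = {"Pic_0": {"CU_0": {"colorChannel": 1, "CULoc_left": 2, "split": 5}}}
new_records = add_best_split_sketch(records)
```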

◆ add_cu_to_dict()

msecnn_raulkviana.dataset_utils.add_cu_to_dict (   cu_dict,
  cu 
)

Adds information of a specific CU to the dictionary.

Parameters
[in] cu_dict: Dictionary with information about all CUs
[in] cu: CU information to add to the dictionary
[out] cu_dict: Dictionary with information about all CUs, with a new CU added

◆ balance_dataset()

msecnn_raulkviana.dataset_utils.balance_dataset (   dir_path,
  stg,
  n_classes = 6 
)

Balances the dataset so that the number of samples per class is the same.

Parameters
[in] dir_path: Path with all the labels (.txt files)
[in] stg: Stage number
[in] n_classes: Number of classes to try to balance

◆ balance_dataset_down()

msecnn_raulkviana.dataset_utils.balance_dataset_down (   dir_path,
  n_classes = 6 
)

Balances the dataset so that the number of samples per class is the same.

Uses downsampling. Different strategy from that of the balance_dataset function.

Parameters
[in] dir_path: Path with all the labels (.txt files)
[in] n_classes: Number of classes to try to balance
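A minimal sketch of the downsampling idea, assuming the labels are an in-memory list of dicts whose "split" key holds the class; the on-disk .txt/pickle handling of the real function is omitted:

```python
import random
from collections import defaultdict

# Sketch under assumptions: "split" is the class key, and balancing means
# sampling every class down to the size of the smallest one.
def downsample_sketch(entries, seed=0):
    by_class = defaultdict(list)
    for entry in entries:
        by_class[entry["split"]].append(entry)
    n_min = min(len(items) for items in by_class.values())  # minority count
    rng = random.Random(seed)
    balanced = []
    for items in by_class.values():
        balanced.extend(rng.sample(items, n_min))           # downsample
    return balanced

data = [{"split": 0}] * 5 + [{"split": 1}] * 2
balanced = downsample_sketch(data)   # 2 entries kept per class
```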

◆ balance_dataset_down_v2()

msecnn_raulkviana.dataset_utils.balance_dataset_down_v2 (   dir_path)

Balances the dataset so that the number of samples per class is the same.

Uses downsampling. Different strategy from that of the balance_dataset function. Faster version.

Parameters
[in] dir_path: Path with all the labels (.txt files)

◆ balance_dataset_down_v3()

msecnn_raulkviana.dataset_utils.balance_dataset_down_v3 (   dir_path)

Balances the dataset so that the number of samples per class is the same.

Uses downsampling. Different strategy from that of the balance_dataset function. Faster version.

Parameters
[in] dir_path: Path with all the labels (.txt files)

◆ balance_dataset_down_v4()

msecnn_raulkviana.dataset_utils.balance_dataset_down_v4 (   dir_path)

Balances the dataset so that the number of samples per class is the same.

Uses downsampling. Different strategy from that of the balance_dataset function. Faster version, without dicts.

Parameters
[in] dir_path: Path with all the labels (.txt files)

◆ balance_dataset_JF()

msecnn_raulkviana.dataset_utils.balance_dataset_JF (   dir_path,
  n_classes = 6 
)

Balances the dataset so that the number of samples per class is the same.

Uses upsampling. Follows the same strategy as the balance_dataset function.

Parameters
[in] dir_path: Path with all the labels (.txt files)
[in] n_classes: Number of classes to try to balance

◆ balance_dataset_up()

msecnn_raulkviana.dataset_utils.balance_dataset_up (   dir_path,
  n_classes = 6 
)

Balances the dataset so that the number of samples per class is the same.

Uses upsampling. Different strategy from that of the balance_dataset function.

Parameters
[in] dir_path: Path with all the labels (.txt files)
[in] n_classes: Number of classes to try to balance

◆ balance_dataset_up_v2()

msecnn_raulkviana.dataset_utils.balance_dataset_up_v2 (   dir_path)

Balances the dataset so that the number of samples per class is the same.

Uses upsampling. Different strategy from that of the balance_dataset function. Faster version.

Parameters
[in] dir_path: Path with all the labels (.txt files)

◆ balance_dataset_up_v3()

msecnn_raulkviana.dataset_utils.balance_dataset_up_v3 (   dir_path)

Balances the dataset so that the number of samples per class is the same.

Uses upsampling. Different strategy from that of the balance_dataset function. Faster version.

Parameters
[in] dir_path: Path with all the labels (.txt files)

◆ bgr2yuv()

msecnn_raulkviana.dataset_utils.bgr2yuv (   matrix)

Converts BGR matrix to YUV matrix.

Parameters
[in] matrix: BGR matrix
[out] YUV: YUV conversion
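For intuition, a self-contained sketch of a BGR-to-YUV conversion using the BT.601 full-range formulas with NumPy. The actual function may instead rely on something like cv2.cvtColor, so treat this as illustrative only:

```python
import numpy as np

def bgr2yuv_sketch(matrix):
    # BT.601 full-range conversion; channel order B, G, R as in OpenCV.
    b = matrix[..., 0].astype(np.float64)
    g = matrix[..., 1].astype(np.float64)
    r = matrix[..., 2].astype(np.float64)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y) + 128.0   # chroma channels centered at 128
    v = 0.877 * (r - y) + 128.0
    return np.stack([y, u, v], axis=-1)

white = np.full((1, 1, 3), 255, dtype=np.uint8)
yuv = bgr2yuv_sketch(white)       # white maps to Y=255, U=V=128
```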

◆ build_entry()

msecnn_raulkviana.dataset_utils.build_entry (   stg1 = [],
  stg2 = [],
  stg3 = [],
  stg4 = [],
  stg5 = [],
  stg6 = [] 
)

Builds an entry with all information needed for each stage, and also removes unnecessary info.

Parameters
[in] stg1: CU (dict with information about the CU) for stage 1
[in] stg2: CU (dict with information about the CU) for stage 2
[in] stg3: CU (dict with information about the CU) for stage 3
[in] stg4: CU (dict with information about the CU) for stage 4
[in] stg5: CU (dict with information about the CU) for stage 5
[in] stg6: CU (dict with information about the CU) for stage 6
[out] entry: Dictionary with information about all the stages' inputs

◆ change_struct_16x16()

msecnn_raulkviana.dataset_utils.change_struct_16x16 (   path_dir_l)

This version is meant to be used to process the stage 4 data.

◆ change_struct_16x16_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_16x16_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 16x16 CUs.

◆ change_struct_16x16_no_dupl_v3()

msecnn_raulkviana.dataset_utils.change_struct_16x16_no_dupl_v3 (   path_dir_l)

This version is like the change_struct_16x16_no_dupl_v2, but uses threads.

◆ change_struct_16x4_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_16x4_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 16x4 CUs.

◆ change_struct_16x8_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_16x8_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 16x8 CUs.

◆ change_struct_32x16_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_32x16_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 32x16 CUs.

◆ change_struct_32x32()

msecnn_raulkviana.dataset_utils.change_struct_32x32 (   path_dir_l)

This version is meant to be used to process the stage 3 data.

◆ change_struct_32x32_eval()

msecnn_raulkviana.dataset_utils.change_struct_32x32_eval (   path_dir_l)

This version is meant to be used to process the stage 3 data.

◆ change_struct_32x32_no_dupl()

msecnn_raulkviana.dataset_utils.change_struct_32x32_no_dupl (   path_dir_l)

This version is like the change_struct_32x32, but it removes possible duplicated rows.

◆ change_struct_32x32_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_32x32_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl, but it is smarter.

◆ change_struct_32x32_no_dupl_v2_test()

msecnn_raulkviana.dataset_utils.change_struct_32x32_no_dupl_v2_test (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but is for verifying if everything is right.

◆ change_struct_32x32_no_dupl_v3()

msecnn_raulkviana.dataset_utils.change_struct_32x32_no_dupl_v3 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but uses threads.

◆ change_struct_32x4_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_32x4_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 32x4 CUs.

◆ change_struct_32x8_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_32x8_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 32x8 CUs.

◆ change_struct_64x64()

msecnn_raulkviana.dataset_utils.change_struct_64x64 (   path_dir_l)

This version is meant to be used to process the stage 1 and 2 data.

◆ change_struct_64x64_eval()

msecnn_raulkviana.dataset_utils.change_struct_64x64_eval (   path_dir_l)

This version is meant to be used to process the stage 1 and 2 data.

◆ change_struct_64x64_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_64x64_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2.

◆ change_struct_64x64_no_dupl_v3()

msecnn_raulkviana.dataset_utils.change_struct_64x64_no_dupl_v3 (   path_dir_l)

This version is like the change_struct_64x64_no_dupl_v2, with threads.

◆ change_struct_8x4_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_8x4_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 8x4 CUs.

◆ change_struct_8x8_no_dupl_v2()

msecnn_raulkviana.dataset_utils.change_struct_8x8_no_dupl_v2 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to 8x8 CUs.

◆ change_struct_no_dupl_stg2_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg2_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 2.

◆ change_struct_no_dupl_stg3_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg3_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 3.

◆ change_struct_no_dupl_stg4_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg4_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 4.

◆ change_struct_no_dupl_stg5_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg5_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 5.

◆ change_struct_no_dupl_stg6_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg6_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 6.

◆ change_struct_no_dupl_stg_2_complexity_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg_2_complexity_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 2.

Here, the data to be used for the complexity assessment is obtained.

◆ change_struct_no_dupl_stg_3_complexity_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg_3_complexity_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 3.

Here, the data to be used for the complexity assessment is obtained.

◆ change_struct_no_dupl_stg_4_complexity_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg_4_complexity_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 4.

Here, the data to be used for the complexity assessment is obtained.

◆ change_struct_no_dupl_stg_5_complexity_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg_5_complexity_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 5.

Here, the data to be used for the complexity assessment is obtained.

◆ change_struct_no_dupl_stg_6_complexity_v4()

msecnn_raulkviana.dataset_utils.change_struct_no_dupl_stg_6_complexity_v4 (   path_dir_l)

This version is like the change_struct_32x32_no_dupl_v2, but it is applied to stage 6.

Here, the data to be used for the complexity assessment is obtained.

◆ compute_split_per_depth()

msecnn_raulkviana.dataset_utils.compute_split_per_depth (   d_path)

Compute the percentage and number of splits per depth of the partitioning scheme.

Parameters
[in] d_path: Path with the files containing the CU sequences
[out] pm: Dictionary with the proportion of each split. {0: 0.1, 1: 0.01, ... , 5: 0.3}
[out] am: Dictionary with the amount of each split. {0: 10, 1: 1, ... , 5: 30}
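The relationship between the two outputs can be shown with plain arithmetic: pm is just am normalised by the total count (the numbers below are made up for illustration):

```python
# Illustrative counts per split mode (0 = non-split ... 5), not real data.
am = {0: 10, 1: 1, 2: 4, 3: 5, 4: 10, 5: 30}

total = sum(am.values())                               # 60 CUs in total
pm = {mode: count / total for mode, count in am.items()}   # e.g. pm[5] == 0.5
```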

◆ compute_split_per_depth_v2()

msecnn_raulkviana.dataset_utils.compute_split_per_depth_v2 (   d_path)

Compute the percentage and number of splits per depth of the partitioning scheme.

This version uses just a dataframe.

Parameters
[in] d_path: Path with the files containing the CU sequences
[out] pm: Dictionary with the proportion of each split. {0: 0.1, 1: 0.01, ... , 5: 0.3}
[out] am: Dictionary with the amount of each split. {0: 10, 1: 1, ... , 5: 30}

◆ compute_split_per_depth_v3()

msecnn_raulkviana.dataset_utils.compute_split_per_depth_v3 (   d_path)

Compute the percentage and number of splits per depth of the partitioning scheme.

This version uses just list comprehensions.

Parameters
[in] d_path: Path with the files containing the CU sequences
[out] pm: Dictionary with the proportion of each split. {0: 0.1, 1: 0.01, ... , 5: 0.3}
[out] am: Dictionary with the amount of each split. {0: 10, 1: 1, ... , 5: 30}

◆ compute_split_proportions()

msecnn_raulkviana.dataset_utils.compute_split_proportions (   path,
  num_cus = float('inf') 
)

Compute the proportion of each split in the dataset.

Parameters
[in] path: Path where the encoded data is located
[in] num_cus: Number of CUs to count for each file to calculate the proportions
[out] pm: Dictionary with the proportion of each split. {0: 0.1, 1: 0.01, ... , 5: 0.3}
[out] am: Dictionary with the amount of each split. {0: 10, 1: 1, ... , 5: 30}

◆ compute_split_proportions_labels()

msecnn_raulkviana.dataset_utils.compute_split_proportions_labels (   path,
  num_cus = float('inf') 
)

Compute the proportion of each split in the dataset.

This version receives a path with labels already processed.

Parameters
[in] path: Path where the encoded data is located
[in] num_cus: Number of CUs to count for each file to calculate the proportions
[out] pm: Dictionary with the proportion of each split. {0: 0.1, 1: 0.01, ... , 5: 0.3}
[out] am: Dictionary with the amount of each split. {0: 10, 1: 1, ... , 5: 30}

◆ compute_split_proportions_with_custom_data()

msecnn_raulkviana.dataset_utils.compute_split_proportions_with_custom_data (   custom_dataset,
  stage,
  num_cus = float('inf') 
)

Compute the proportion of each split in the dataset (Custom dataset class).

Parameters
[in] custom_dataset: Object with the custom dataset
[in] stage: Stage number for which the proportions will be computed
[in] num_cus: Number of CUs to count to calculate the proportions
[out] pm: Dictionary with the proportion of each split. {0: 0.1, 1: 0.01, ... , 5: 0.3}
[out] am: Dictionary with the amount of each split. {0: 10, 1: 1, ... , 5: 30}

◆ compute_split_proportions_with_custom_data_multi()

msecnn_raulkviana.dataset_utils.compute_split_proportions_with_custom_data_multi (   custom_dataset,
  split_pos_in_struct,
  num_cus = float('inf') 
)

Compute the proportion of each split in the dataset (Custom dataset class).

Parameters
[in] custom_dataset: Object with the custom dataset
[in] split_pos_in_struct: Position in the dataset with the split information
[out] pm: Dictionary with the proportion of each split. {0: 0.1, 1: 0.01, ... , 5: 0.3}
[out] am: Dictionary with the amount of each split. {0: 10, 1: 1, ... , 5: 30}

◆ compute_split_proportions_with_custom_data_multi_new()

msecnn_raulkviana.dataset_utils.compute_split_proportions_with_custom_data_multi_new (   custom_dataset,
  split_pos_in_struct,
  num_cus = float('inf') 
)

Compute the proportion of each split in the dataset (Custom dataset class).

Parameters
[in] custom_dataset: Object with the custom dataset
[in] split_pos_in_struct: Position in the dataset with the split information
[out] pm: Dictionary with the proportion of each split. {0: 0.1, 1: 0.01, ... , 5: 0.3}
[out] am: Dictionary with the amount of each split. {0: 10, 1: 1, ... , 5: 30}

◆ compute_split_proportions_with_path_multi_new()

msecnn_raulkviana.dataset_utils.compute_split_proportions_with_path_multi_new (   path,
  split_pos_in_struct,
  num_cus = float('inf') 
)

Compute the proportion of each split in the dataset located at the given path.

Parameters
[in]pathPath where the dataset files are located
[in]split_pos_in_structPosition in dataset with the split information
[in]num_cusNumber of CUs to count when computing the proportions
[out]pmDictionary with the proportion of each split. {0: 0.1, 1:0.01, ... , 5:0.3}
[out]amDictionary with the amount of each split. {0: 10, 1:1, ... , 5:30}

◆ create_dir()

msecnn_raulkviana.dataset_utils.create_dir (   output_dir)

Creates a directory.

If the directory already exists, it will be deleted

Parameters
[in]output_dirName of the directory
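A minimal sketch of the documented delete-then-recreate behaviour, assuming the real function behaves like this:

```python
import os
import shutil

def create_dir(output_dir):
    """Create `output_dir`; if it already exists, delete it first
    so the resulting directory is always empty."""
    if os.path.isdir(output_dir):
        shutil.rmtree(output_dir)
    os.makedirs(output_dir)
```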

◆ csv2lst()

msecnn_raulkviana.dataset_utils.csv2lst (   csv_file)

Reads csv file.

Parameters
[in]csv_filePath with the csv file
[out]lstList of dictionaries with the labels from the csv file
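Since the output is a list of dictionaries keyed by the csv header, the behaviour can be approximated with the standard library alone (a sketch, not the actual implementation):

```python
import csv

def csv2lst(csv_file):
    """Read a csv of labels into a list of dictionaries,
    one dict per row, keyed by the header fields."""
    with open(csv_file, newline="") as f:
        return list(csv.DictReader(f))
```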

◆ encode_dataset()

msecnn_raulkviana.dataset_utils.encode_dataset (   d_path = "C:\\Users\\Raul\\Dropbox\\Dataset",
  e_path = "C:\\Users\\Raul\\Documents\\GitHub\\CPIV\\VTM-7.0_Data\\bin\\vs16\\msvc-19.24\\x86_64\\release",
  ts = 1,
  QP = 32 
)

This function encodes the entire dataset within a given path.

Parameters
[in]d_pathPath containing the dataset with the files to encode (this path cannot contain spaces)
[in]e_pathPath containing the encoder and configurations for it
[in]tsTemporal subsampling ratio (ts is the parameter that controls the amount of frames)
[in]QPQuantization parameter

◆ extract_content()

msecnn_raulkviana.dataset_utils.extract_content (   f)

Extract a single record from binary file.

Parameters
[in]fFile object
[out]contentDictionary containing the information of a single record

◆ file2lst()

msecnn_raulkviana.dataset_utils.file2lst (   file)

Reads file.

Parameters
[in]filePath with the txt file
[out]lstList of dictionaries with the labels from a pickle file

◆ file_stats()

msecnn_raulkviana.dataset_utils.file_stats (   path)

Finds out the size of the binary file and computes the number of records.

Parameters
[in]pathPath where the binary file is located
[out]num_recordsNumber of records that the binary file contains
[out]file_sizeSize of the binary file

◆ file_stats_v2()

msecnn_raulkviana.dataset_utils.file_stats_v2 (   path)

Finds out the size of all binary files, computes the total amount of records, and computes the amount of each CU type.

Parameters
[in]pathPath where the binary files are located
[out]num_recordsNumber of records that all binary files contain
[out]amount_dicDictionary with the amount of each CU amount_dic = {"file_name": {"128x128L":100, "128x128C":100, ... , "4x4C", "4x4L"}, ..., "file_name2":{...}}, in which C stands for chroma and L for Luma
[out]summary_dicDictionary with the sum of each CU type

◆ find_cu()

msecnn_raulkviana.dataset_utils.find_cu (   df_cu,
  CTU,
  position,
  size 
)

Verifies if the CU is in the dataframe, using the size and other information.

Uses a pandas dataframe

Parameters
[in]df_cuDataframe with all the CUs
[in]CTUOriginal CTU (dict with information about the CTU)
[in]positionPosition of the CU being searched for [left, top]
[in]sizeSize of the CU being searched for
[out]cuEither a CU pandas series object or the boolean value False, indicating that the CU wasn't found

◆ gen_dataset_types()

msecnn_raulkviana.dataset_utils.gen_dataset_types (   d_path,
  valid_percent 
)

Generate a dataset for training, validation and testing.

This is done by concatenating all of the data from a folder and then dividing it into three parts

Parameters
[in]d_pathPath with all the labels (.txt files)
[in]valid_percentPercentage of data allocated to test and validation data
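The described concatenate-and-divide step can be sketched as follows; `split_dataset`, its argument names and the fixed seed are illustrative assumptions, not the actual implementation:

```python
import random

def split_dataset(entries, valid_percent, seed=0):
    """Shuffle the concatenated labels and carve out validation and
    test partitions of `valid_percent` each; the rest is training."""
    entries = list(entries)
    random.Random(seed).shuffle(entries)
    n = int(len(entries) * valid_percent)
    valid, test, train = entries[:n], entries[n:2 * n], entries[2 * n:]
    return train, valid, test
```

For example, with 10 entries and `valid_percent = 0.2`, this produces 6 training, 2 validation and 2 test entries.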

◆ get_file_metadata_info()

msecnn_raulkviana.dataset_utils.get_file_metadata_info (   path,
  name 
)

Retrieves metadata from the YUV file (frame rate, width, height and number of frames)

Parameters
[in]pathPath containing dataset
[in]nameName of the file
[out]file_infoDictionary with information about the yuv file (dimensions, frame rate and number of frames) or a boolean value indicating that no information is available

◆ get_file_metadata_info_mod()

msecnn_raulkviana.dataset_utils.get_file_metadata_info_mod (   name)

Retrieves metadata from the YUV file (frame rate, width and height).

This version doesn't compute the number of frames.

Parameters
[in]nameName of the file
[out]file_infoDictionary with information about the yuv file (dimensions and frame rate) or a boolean value indicating that no information is available

◆ get_files_from_folder()

msecnn_raulkviana.dataset_utils.get_files_from_folder (   path,
  endswith = ".yuv" 
)

This function obtains the name of all .yuv files in a given path.

Parameters
[in]pathPath containing the files
[out]files_listList containing the names of all files matching the extension
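A minimal sketch of this lookup, assuming a flat directory (sorting added here for a deterministic order):

```python
import os

def get_files_from_folder(path, endswith=".yuv"):
    """Return the names of all files in `path` whose name ends
    with the given extension."""
    return sorted(f for f in os.listdir(path) if f.endswith(endswith))
```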

◆ get_num_frames()

msecnn_raulkviana.dataset_utils.get_num_frames (   path,
  name,
  width,
  height 
)

Get number of frames in yuv file.

Parameters
[in]pathPath containing dataset
[in]nameName of the file
[in]widthWidth of the picture
[in]heightHeight of the picture
[out]num_framesNumber of frames that the file contains
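For raw YUV files the frame count can be derived from the file size alone. The sketch below assumes 8-bit 4:2:0 content, where each frame occupies width*height luma bytes plus two quarter-size chroma planes, i.e. width*height*3/2 bytes in total:

```python
import os

def get_num_frames(path, name, width, height):
    """Frame count for a raw 8-bit YUV 4:2:0 file: file size divided
    by the per-frame byte count (width*height*3/2)."""
    frame_bytes = width * height * 3 // 2
    return os.path.getsize(os.path.join(path, name)) // frame_bytes
```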

◆ get_some_data_equaly()

msecnn_raulkviana.dataset_utils.get_some_data_equaly (   X,
  path_dir_l,
  classes,
  split_pos 
)

Gets X amount of data from the files, sampled equally across the given classes.

◆ labels_with_specific_cch()

msecnn_raulkviana.dataset_utils.labels_with_specific_cch (   dir_path,
  cch = 0 
)

Obtain, from a group of labels in a pickle file, the CUs whose color channel is 'cch'.

Parameters
[in]dir_pathPath with all the labels (.txt files)
[in]cchColor Channel

◆ list2tuple()

msecnn_raulkviana.dataset_utils.list2tuple (   l)

◆ lst2csv()

msecnn_raulkviana.dataset_utils.lst2csv (   lst,
  name_of_file 
)

Converts list of dictionaries to csv file.

Parameters
[in]lstList of dictionaries
[in]name_of_fileName to be given to the csv file

◆ lst2csv_v2()

msecnn_raulkviana.dataset_utils.lst2csv_v2 (   lst_lst,
  n_file,
  n_fields 
)

Converts list to csv file using a pandas dataframe.

Parameters
[in]lst_lstList of lists
[in]n_fileName to be given to the csv file
[in]n_fieldsList of names given to each field

◆ lst2file()

msecnn_raulkviana.dataset_utils.lst2file (   lst,
  name_of_file 
)

Converts list of dictionaries to file.

Parameters
[in]lstList of dictionaries
[in]name_of_fileName to be given to the file
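lst2file and file2lst plausibly form a pickle round trip (the labels are described in this documentation as pickle data stored with a .txt extension; that extension handling is an assumption here):

```python
import pickle

def lst2file(lst, name_of_file):
    """Serialise a list of label dicts with pickle; appending the
    ".txt" extension is an assumption taken from the docs above."""
    with open(name_of_file + ".txt", "wb") as f:
        pickle.dump(lst, f)

def file2lst(file):
    """Inverse of lst2file: load the pickled list back."""
    with open(file + ".txt", "rb") as f:
        return pickle.load(f)
```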

◆ match_cu()

msecnn_raulkviana.dataset_utils.match_cu (   CU,
  CTU,
  position,
  size 
)

Verifies if the CUs are the same based on their position, size and other information.

Parameters
[in]CUCU (dict with information about the CU) that will be inspected
[in]CTUOriginal CTU (dict with information about the CTU)
[in]positionPosition of the CU being searched for
[in]sizeSize of the CU being searched for
[out]match_or_notBoolean value indicating whether the CUs match
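A sketch of the matching logic; the field names (CULoc_left, POC, colorChannel, ...) follow the record example given for read_from_records, and the exact comparison set is an assumption:

```python
def match_cu(CU, CTU, position, size):
    """A CU matches when it sits at the searched position with the
    searched size and shares picture number and colour channel with
    the CTU; the fields compared here are illustrative."""
    return (CU["CULoc_left"] == position[0]
            and CU["CULoc_top"] == position[1]
            and CU["width"] == size[0]
            and CU["height"] == size[1]
            and CU["POC"] == CTU["POC"]
            and CU["colorChannel"] == CTU["colorChannel"])
```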

◆ mod_16x16_threads()

msecnn_raulkviana.dataset_utils.mod_16x16_threads (   f,
  path_dir_l,
  right_rows,
  columns,
  new_dir 
)

◆ mod_32x32_threads()

msecnn_raulkviana.dataset_utils.mod_32x32_threads (   f,
  path_dir_l,
  right_rows,
  columns,
  new_dir 
)

◆ mod_64x64_threads()

msecnn_raulkviana.dataset_utils.mod_64x64_threads (   f,
  path_dir_l,
  right_rows,
  columns,
  new_dir 
)

◆ process_ctus_cus()

msecnn_raulkviana.dataset_utils.process_ctus_cus (   df_ctus,
  df_cus 
)

Function to create data structures to organize the CTUs and CUs.

TODO: Try to implement this with recursion

Parameters
[in]df_ctusDataframe with CTUs
[in]df_cusDataframe with CUs
[out]structed_cusDictionary containing all the CUs organized in a stage-oriented way. Each entry looks like: [f_name_labels, pic_name, RD0, RD1, RD2, RD3, RD4, RD5, pos, size]

◆ process_info()

msecnn_raulkviana.dataset_utils.process_info (   content)

Process the raw data from the labels given by the encoder.

Parameters
[in]contentDict with the raw information from the encoder
[out]contentProcessed dict

◆ read_from_records()

msecnn_raulkviana.dataset_utils.read_from_records (   path,
  num_records 
)

Read the information/file generated by the encoder. The result is a dictionary containing all the info about the file: a dictionary of picture numbers, each of which leads to a dictionary with the info.

For example: records = {"Pic_0" :{"CU_0": {"colorChannel": 1, "CULoc_left": 2, ... } ... ... } }

Parameters
[in]pathPath where the file is located
[in]num_recordsNumber of records to show
[out]recordsDictionary containing the information of all records

◆ read_from_records_v2()

msecnn_raulkviana.dataset_utils.read_from_records_v2 (   f,
  f_name,
  num_records 
)

Read the information/file generated by the encoder.

This version takes the file object and is adapted for the unite_labels_v3 function. The result is a dictionary containing all the info about the file: a dictionary of picture numbers, each of which leads to a dictionary with the info. For example: records = {"Pic_0" :{"CU_0": {"colorChannel": 1, "CULoc_left": 2, ... } ... ... } }

Parameters
[in]fFile object
[in]f_namePath where the file is located
[in]num_recordsNumber of records
[out]recordsDictionary containing the information of all records

◆ show_bin_content()

msecnn_raulkviana.dataset_utils.show_bin_content (   path,
  num_records = 100 
)

Show contents of a binary file containing encoding information.

Parameters
[in]pathPath where the binary file is located
[in]num_recordsNumber of records to show

◆ split()

msecnn_raulkviana.dataset_utils.split (   size,
  pos,
  split_mode 
)

Split a CU with one of the specific modes (quad tree, binary vertical tree, binary horizontal tree, ternary vertical tree, etc.)

Parameters
[in]sizeSize of the CU (width, height)
[in]posPosition of the CU (left, top)
[out]new_positionsTuple with the positions of the resulting CUs
[out]new_sizesTuple with the sizes of the resulting CUs
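Two of the QTMT split modes can be sketched as follows; the mode numbering (1 = quad tree, 2 = binary vertical) and the (left, top)/(width, height) conventions are assumptions, not the actual implementation:

```python
def split(size, pos, split_mode):
    """Sketch of two QTMT split modes: a quad-tree split yields four
    half-width, half-height CUs; a binary vertical split yields two
    half-width, full-height CUs."""
    w, h = size
    left, top = pos
    if split_mode == 1:  # quad tree
        new_positions = ((left, top), (left + w // 2, top),
                         (left, top + h // 2), (left + w // 2, top + h // 2))
        new_sizes = ((w // 2, h // 2),) * 4
    elif split_mode == 2:  # binary vertical
        new_positions = ((left, top), (left + w // 2, top))
        new_sizes = ((w // 2, h),) * 2
    else:
        raise NotImplementedError("only modes 1 and 2 are sketched here")
    return new_positions, new_sizes
```

For a 64x64 CU at (0, 0), the quad-tree mode returns four 32x32 CUs at (0, 0), (32, 0), (0, 32) and (32, 32).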

◆ transform_create_struct_faster_v2_mod_divs()

msecnn_raulkviana.dataset_utils.transform_create_struct_faster_v2_mod_divs (   f,
  f_name,
  num_records,
  output_dir,
  n_output_file,
  color_ch = 0 
)

First obtains all CTUs and CUs in the file using a dictionary/dataframe, then organizes them in a stage-oriented way.

Removes elements from the CU list to speed up the process and uses only the specified color channel. This version divides the info into multiple files.

Parameters
[in]fFile object
[in]f_nameFile name
[in]num_recordsNumber of records
[in]color_chColor channel
[out]structed_cusDictionary containing all the CUs organized in a stage-oriented way. Each entry looks like: [f_name_labels, pic_name, RD0, RD1, RD2, RD3, RD4, RD5, pos, size]

◆ transform_create_struct_faster_v3()

msecnn_raulkviana.dataset_utils.transform_create_struct_faster_v3 (   f,
  f_name,
  num_records,
  output_dir,
  n_output_file,
  color_ch = 0 
)

First obtains all CTUs and CUs in the file using a dictionary/dataframe, then organizes them in a stage-oriented way.

Removes elements from the CU list to speed up the process and uses only the specified color channel. This version is similar to the div version, but outputs only a single file.

Parameters
[in]fFile object
[in]f_nameFile name
[in]num_recordsNumber of records
[in]color_chColor channel
[out]structed_cusDictionary containing all the CUs organized in a stage-oriented way. Each entry looks like: [f_name_labels, pic_name, RD0, RD1, RD2, RD3, RD4, RD5, pos, size]

◆ transform_raw_dataset()

msecnn_raulkviana.dataset_utils.transform_raw_dataset (   dic)

Transform the raw dataset (a dictionary with the information of all datasets) into a list of dictionaries.

  • List entry: pic_name | color_ch | POC | CU_loc_left | ... | split CU oriented style
Parameters
[in]dicDictionary containing all the raw data
[out]lst_dictsList of dictionaries (entries of the information of each CU)

◆ tuple2list()

msecnn_raulkviana.dataset_utils.tuple2list (   l)

◆ unite_labels_v6()

msecnn_raulkviana.dataset_utils.unite_labels_v6 (   dir_path_l,
  n_output_file = "labels_pickle",
  color_ch = 0 
)

Unites all the labels into a giant list.

This version follows a stage-oriented approach and uses just the specified color channel

Parameters
[in]dir_path_lPath with all the labels (.dat files)
[in]n_output_fileName for the output file
[in]color_chColor channel

◆ unite_labels_v6_mod()

msecnn_raulkviana.dataset_utils.unite_labels_v6_mod (   dir_path_l,
  n_output_file = "labels_pickle",
  color_ch = 0 
)

Unites all the labels into a giant list.

This version follows a stage-oriented approach and uses just the specified color channel

Parameters
[in]dir_path_lPath with all the labels (.dat files)
[in]n_output_fileName for the output file
[in]color_chColor channel

◆ yuv2bgr()

msecnn_raulkviana.dataset_utils.yuv2bgr (   matrix)

Converts a YUV matrix to a BGR matrix.

Parameters
[in]matrixYUV matrix
[out]bgrBGR conversion
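A pure-NumPy sketch of the conversion, assuming full-range BT.601 coefficients and an (H, W, 3) uint8 input; the actual implementation may instead rely on OpenCV's cv2.cvtColor:

```python
import numpy as np

def yuv2bgr(matrix):
    """Full-range BT.601 YUV -> BGR for an (H, W, 3) uint8 array."""
    y, u, v = (matrix[..., i].astype(np.float32) for i in range(3))
    b = y + 1.772 * (u - 128.0)
    g = y - 0.344136 * (u - 128.0) - 0.714136 * (v - 128.0)
    r = y + 1.402 * (v - 128.0)
    return np.clip(np.stack([b, g, r], axis=-1), 0, 255).astype(np.uint8)
```

As a sanity check, a neutral-chroma pixel (U = V = 128) maps to a gray BGR pixel with all channels equal to Y.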