Internal Functions

FMDData.states_dict Constant
julia
states_dict

A dictionary of the States/UTs that can appear in the dataset. The keys are the names returned by the cleaning steps, and the values are the names that can be matched in the underlying datasets.

source
FMDData._calculate_state_counts Method
julia
_calculate_state_counts(table, original_df)

An internal function to handle the calculation of the state/serotype counts based upon the provided state/serotype seroprevalence values and total state counts. Because DataFrames handles tables as named tuples, we can extract information about the columns being passed from the regex selection and then use substitution strings to collect a view of the correct column of total state counts.

You probably want to use the user-facing function calculate_state_counts() instead.
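
As a rough illustration of the substitution-string idea described above, the sketch below derives the name of the matching totals column from a seroprevalence column name and then converts percentages to counts. The column names, the regex, the use of a plain NamedTuple in place of a DataFrame, and the assumption that seroprevalence is stored as a percentage are all hypothetical; this is not the package's implementation.

```julia
# Hypothetical column-name pattern: serotype_<name>_pct_<pre|post>
reg = r"serotype_(.*)_pct_(pre|post)"

# Plain NamedTuple of vectors standing in for a DataFrame (assumed layout)
table = (
    serotype_all_count_pre = [100, 50],
    serotype_o_pct_pre = [12.0, 30.0],
)

pct_name = "serotype_o_pct_pre"

# A substitution string reuses capture group 2 (pre/post) to build the
# name of the column that holds the total counts
total_name = replace(pct_name, reg => s"serotype_all_count_\2")

# Convert percentage seroprevalence to counts, rounding to whole numbers
totals = getproperty(table, Symbol(total_name))
pcts = getproperty(table, Symbol(pct_name))
counts = round.(Int, pcts ./ 100 .* totals)
```

Because the counts are rounded, converting them back to percentages will not always reproduce the input values exactly.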

source
FMDData._calculate_state_seroprevalence Method
julia
_calculate_state_seroprevalence(table, original_df)

An internal function to handle the calculation of the state/serotype seroprevalence values based upon the provided state/serotype counts and total state counts. Because DataFrames handles tables as named tuples, we can extract information about the columns being passed from the regex selection and then use substitution strings to collect a view of the correct column of total state counts.

You probably want to use the user-facing function calculate_state_seroprevalence() instead.
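
A minimal sketch of a seroprevalence calculation from serotype-specific counts and total counts. The input vectors are made up, and reporting the result on a percent scale is an assumption rather than something this page confirms.

```julia
# Hypothetical inputs: serotype-specific counts and total counts per state
serotype_counts = [12, 15]
total_counts = [100, 50]

# Seroprevalence as a percentage of the total (percent scale is an assumption)
seroprevalence = 100 .* serotype_counts ./ total_counts
```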

source
FMDData._calculate_string_occurences Method
julia
_calculate_string_occurences(
    vals::Vector{S},
    unique_vals::Vector{S} = unique(vals)
) where {S <: AbstractString}

Internal function to calculate how many times each unique string value occurs in a vector of strings.
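
One way to picture this calculation, using only Base Julia. The function name `count_string_occurrences` and the choice to return a `Dict` are assumptions made for the sketch; the return type of the internal function is not documented here.

```julia
# Sketch: count how often each unique string appears in a vector
function count_string_occurrences(
        vals::Vector{S},
        unique_vals::Vector{S} = unique(vals),
    ) where {S <: AbstractString}
    # Pair each unique value with the number of matching elements
    return Dict(u => count(==(u), vals) for u in unique_vals)
end

occurrences = count_string_occurrences(["a", "b", "a", "c", "a"])
```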

source
FMDData._calculate_totals! Method
julia
_calculate_totals!(
    totals_dict::OrderedDict,
    col::Vector{T},
    colname::String,
) where {T <: Union{<:Union{<:Missing, <:Integer}, <:Integer}}

Internal function to calculate the serotype total.
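
The signature suggests that a column total is stored in the dictionary under the column name. The sketch below shows one plausible reading of that. The helper name `store_column_total!`, the use of a Base `Dict` instead of an `OrderedDict`, and the decision to skip missing values are all assumptions.

```julia
# Stand-in for the totals dictionary (the package signature takes an OrderedDict)
totals = Dict{String, Int}()

# Hypothetical helper: sum a count column, skipping missing values,
# and store the result under the column name
function store_column_total!(totals_dict::AbstractDict, col::AbstractVector, colname::String)
    totals_dict[colname] = sum(skipmissing(col))
    return totals_dict
end

store_column_total!(totals, [10, missing, 5], "serotype_o_count_pre")
```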

source
FMDData._check_all_required_serotypes Method
julia
_check_all_required_serotypes(
    all_matched_serotypes::T,
    allowed_serotypes::T = default_allowed_serotypes,
) where {T <: AbstractVector{<:AbstractString}}

Internal function to check that all required serotypes are provided in the data.
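
A check of this kind can be expressed as a set difference. In the sketch below the serotype lists are hypothetical placeholders; the package's actual defaults live in `default_allowed_serotypes` and are not shown on this page.

```julia
# Hypothetical required serotypes and the serotypes found in the data
required_serotypes = ["O", "A", "Asia1"]
matched_serotypes = ["O", "A"]

# Any required serotype absent from the matched set is reported
missing_serotypes = setdiff(required_serotypes, matched_serotypes)
all_present = isempty(missing_serotypes)
```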

source
FMDData._check_identical_column_names Method
julia
_check_identical_column_names(df::DataFrame)

Check if the provided data has any duplicate column names.

Should be run BEFORE _check_similar_column_names(), because the push!() call in _check_similar_column_names() will overwrite the previous Dict entry key (of similar column names) if there are exact matches.
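
A duplicate-name check can be sketched by tallying names and keeping those seen more than once. The column names below are hypothetical, and a plain vector of strings stands in for the names of a DataFrame.

```julia
# Hypothetical column names as they might appear in a raw file header
colnames = ["states_ut", "serotype_o_count_pre", "states_ut"]

# Tally each name, then keep the names that occur more than once
tally = Dict{String, Int}()
for name in colnames
    tally[name] = get(tally, name, 0) + 1
end
duplicates = sort([name for (name, n) in tally if n > 1])
```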

source
FMDData._check_no_disallowed_serotypes Method
julia
_check_no_disallowed_serotypes(
    all_matched_serotypes::T,
    allowed_serotypes::T = default_allowed_serotypes,
) where {T <: AbstractVector{<:AbstractString}}

Internal function to check that there are no disallowed serotypes provided in the data.
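
The disallowed-serotype check is the set difference in the opposite direction: anything matched in the data that is not on the allowed list. The lists below are hypothetical placeholders.

```julia
# Hypothetical allowed serotypes and the serotypes found in the data
allowed_serotypes = ["O", "A", "Asia1"]
matched_serotypes = ["O", "A", "C"]

# Serotypes present in the data but not on the allowed list
disallowed_serotypes = setdiff(matched_serotypes, allowed_serotypes)
```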

source
FMDData._check_similar_column_names Method
julia
_check_similar_column_names(df::DataFrame)

Check if any columns have similar names, by testing whether any column name is a substring of another column name.

Should be run AFTER _check_identical_column_names(), because the push!() call will overwrite the previous Dict entry key if there are exact matches.
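
The substring test can be sketched with `occursin`. The column names and the shape of the result (each name mapped to the names that contain it) are assumptions for illustration only.

```julia
# Hypothetical column names
colnames = ["serotype_o", "serotype_o_count", "states_ut"]

# Map each name to the other names that contain it as a substring
similar_names = Dict{String, Vector{String}}()
for name in colnames, other in colnames
    # Identical names are skipped here; detecting them is left to the duplicate check
    name == other && continue
    if occursin(name, other)
        push!(get!(similar_names, name, String[]), other)
    end
end
```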

source
FMDData._collect_totals_check_args Method
julia
_collect_totals_check_args(
    col::Vector{T},
    colname::String,
    _...
) where {T <: Union{Union{<:Missing, <:Integer}, <:Integer}}

Collect the necessary arguments to provide to the _calculate_totals!() function for count-based columns. Uses _... varargs to denote that additional arguments (relevant for seroprevalence calculations in other methods of this function) might be passed but are not used in this specific method for integer/count columns.

Arguments

  • col::Vector{T}: The column vector of counts.

  • colname::String: The name of the column.

  • _...: Varargs for unused parameters in this method.

Returns a Try.Ok containing a tuple (col, colname) to be unpacked and passed to _calculate_totals!.
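
The `_...` pattern can be tried out in isolation. In this sketch the function name `collect_args_sketch` is hypothetical, and a plain tuple is returned instead of a `Try.Ok` so that no packages are needed.

```julia
# Method for integer (or missing) count columns: extra arguments are
# accepted through `_...` and ignored
function collect_args_sketch(
        col::Vector{T},
        colname::String,
        _...
    ) where {T <: Union{Missing, Integer}}
    # The documented method wraps this tuple in Try.Ok; a plain tuple is used here
    return (col, colname)
end

# The third argument is a hypothetical extra that this method does not use
args = collect_args_sketch([1, 2, 3], "serotype_o_count_pre", "unused")
```

Accepting and discarding trailing arguments lets callers pass the same argument list to every method of the function, whichever column type is dispatched on.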

source
FMDData._combine_error_messages Method
julia
_combine_error_messages(arr_of_errs::AbstractVector{T}; filter_ok = false) where {T <: Try.InternalPrelude.AbstractResult}

Internal function. Combines error messages from a vector of Try results into a single string.

This is useful for aggregating multiple errors into a single, more informative error message.

Arguments

  • arr_of_errs: A vector of Try.Ok or Try.Err objects.

  • filter_ok: If true, Try.Ok results are filtered out before combining messages. Defaults to false.
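
The behaviour described above can be approximated without Try.jl. In this sketch `OkSketch` and `ErrSketch` are hypothetical stand-ins for `Try.Ok` and `Try.Err`, `message_or_empty` plays the role of `_unwrap_err_or_empty_str`, and joining messages with a newline is an assumption.

```julia
# Minimal stand-ins for Try.Ok and Try.Err so the sketch needs no packages
struct OkSketch{T}
    value::T
end
struct ErrSketch
    message::String
end

# An Err contributes its message; an Ok contributes an empty string
message_or_empty(res::ErrSketch) = res.message
message_or_empty(::OkSketch) = ""

function combine_messages_sketch(results; filter_ok = false)
    # Optionally drop Ok results before collecting messages
    kept = filter_ok ? filter(r -> r isa ErrSketch, results) : results
    return join(map(message_or_empty, kept), "\n")
end

results = [OkSketch(1), ErrSketch("missing serotype A"), ErrSketch("duplicate column")]
combined = combine_messages_sketch(results; filter_ok = true)
```

With `filter_ok = false`, each Ok result still contributes an empty string, so the combined message in this sketch starts with a blank line.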

source
FMDData._correct_serotype_counts! Method
julia
_correct_serotype_counts!(
    df::DataFrame;
    statename_column = :states_ut,
    allowed_serotypes = default_allowed_serotypes,
    reg::Regex
)

Correct any serotype counts that were miscalculated during the inference steps. These errors arise from rounding in the provided seroprevalence numbers, which are translated into counts and then differenced between the initial and later dataframes. If the pre or post counts for all serotypes are 0, then all serotype-specific counts must be 0 as well, so they are corrected to 0.
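
The zero-total rule can be sketched on plain vectors. The data and the layout (one vector of all-serotype totals, one vector per serotype) are hypothetical and stand in for DataFrame columns.

```julia
# Hypothetical per-state counts: an all-serotypes total and serotype-specific counts
all_counts = [0, 40]
serotype_counts = Dict("O" => [1, 12], "A" => [0, 7])

# Where the total is 0, force every serotype-specific count to 0
for counts in values(serotype_counts)
    counts[all_counts .== 0] .= 0
end
```

Here the stray count of 1 for serotype "O" in the first state is reset, because a state with no positives overall cannot have positives for a single serotype.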

source
FMDData._log_try_error Function
julia
_log_try_error(res, type::Symbol = :Error; unwrap_ok = true)

Internal function. Checks a Try result. If it's an Err, it logs the error message and returns the unwrapped error. If it's an Ok, it returns the unwrapped value by default.

This function helps manage control flow by logging non-critical errors without halting execution, while still allowing critical errors to be propagated.

Arguments

  • res: The Try.Ok or Try.Err object to check.

  • type::Symbol: The logging level to use if res is an Err. Can be :Error, :Warn, or :Info. Defaults to :Error.

  • unwrap_ok::Bool: If true, returns the unwrapped value of an Ok result. If false, returns the Try.Ok object itself. Defaults to true.
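
A sketch of this control flow using Base logging macros. `OkDemo` and `ErrDemo` are hypothetical stand-ins for `Try.Ok` and `Try.Err`, and throwing on an unrecognised `type` is an assumption rather than documented behaviour.

```julia
# Stand-ins for Try.Ok and Try.Err (hypothetical, to avoid a package dependency)
struct OkDemo{T}
    value::T
end
struct ErrDemo
    message::String
end

function log_result_sketch(res, type::Symbol = :Error; unwrap_ok = true)
    if res isa ErrDemo
        # Log at the requested level, then hand back the unwrapped message
        if type === :Error
            @error res.message
        elseif type === :Warn
            @warn res.message
        elseif type === :Info
            @info res.message
        else
            throw(ArgumentError("type must be :Error, :Warn, or :Info"))
        end
        return res.message
    end
    # Ok results are unwrapped unless the caller asks for the wrapper
    return unwrap_ok ? res.value : res
end

value = log_result_sketch(OkDemo(42))
message = log_result_sketch(ErrDemo("total row not found"), :Warn)
```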

source
FMDData._totals_row_selectors Function
julia
_totals_row_selectors(
    df::DataFrame,
    column::Symbol = :states_ut,
    totals_key = "total";
    allowed_serotypes = vcat("all", default_allowed_serotypes),
    reg::Regex
)

Internal function to extract the totals row and the subset of dataframe rows that match the regex.

source
FMDData._unwrap_err_or_empty_str Method
julia
_unwrap_err_or_empty_str(res)

Internal function. Unwraps a Try.Err to get its error message, or returns an empty string for a Try.Ok.

This function is a helper for _combine_error_messages, ensuring that only error messages are included in the final combined string.

source
FMDData.collect_all_present_serotypes Function
julia
collect_all_present_serotypes(df::DataFrame, reg::Regex)

Return a vector of all column names that contain serotype information specified in the regex.
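
Selecting column names by regex can be sketched with `filter` and `occursin`. The column names and the pattern are hypothetical, and a vector of strings stands in for the names of a DataFrame.

```julia
# Hypothetical column names and serotype pattern
colnames = ["states_ut", "serotype_o_count_pre", "serotype_a_pct_post", "notes"]
reg = r"serotype_(.*)_(count|pct)_(pre|post)"

# Keep only the names that match the serotype pattern
serotype_columns = filter(name -> occursin(reg, name), colnames)
```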

source
FMDData.correct_state_name Method
julia
correct_state_name(
    input_name::String,
    states_dict::Dict = FMDData.states_dict
)

Check whether a state name is spelled correctly, or has previously been characterized and matched with a correct name. Returns the correct name if possible; otherwise errors.
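
A lookup of this kind might look like the sketch below. The dictionary contents are placeholders, the layout (correct name as key, vector of known variants as value) is an assumption about `states_dict`, and throwing an `ArgumentError` is only one way the failure could be reported.

```julia
# Hypothetical dictionary: correct name => known variants seen in raw data
states_sketch = Dict(
    "State A" => ["State A", "Stat A", "state a"],
    "State B" => ["State B", "State-B"],
)

function correct_state_name_sketch(input_name::String, states_dict::Dict = states_sketch)
    # An exact key match is already correct
    haskey(states_dict, input_name) && return input_name
    # Otherwise look for a correct name whose variants include the input
    for (correct_name, variants) in states_dict
        input_name in variants && return correct_name
    end
    # The package may report this differently; throwing is an assumption
    throw(ArgumentError("unrecognized state name: $input_name"))
end

corrected = correct_state_name_sketch("State-B")
```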

source