YugiQuery
- class CG(value)
Bases:
Enum
Enum representing the card game formats.
- CG
Both TCG and OCG.
- Type:
str
- TCG
The ‘trading card game’ type.
- Type:
str
- OCG
The ‘official card game’ type.
- Type:
str
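As a rough illustration of how an enum with these three members can be defined and consumed, the sketch below builds a stand-in class. The class name CGSketch, the helper describe, and the member values are assumptions made for the example; the listing above only states that each member is of type str.

```python
from enum import Enum


class CGSketch(Enum):
    """Hypothetical stand-in for the CG enum; the member values are assumed."""

    CG = "CG"    # both TCG and OCG
    TCG = "TCG"  # the trading card game
    OCG = "OCG"  # the official card game


def describe(cg: CGSketch) -> str:
    """Return a human-readable label for a member (illustrative helper)."""
    labels = {
        CGSketch.CG: "TCG and OCG",
        CGSketch.TCG: "trading card game",
        CGSketch.OCG: "official card game",
    }
    return labels[cg]


print(describe(CGSketch.TCG))
```

Passing an enum member rather than a bare string lets a function reject unknown formats early, which is presumably why the fetch functions below take a cg argument of this type.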
- adjust_lightness(color: str, amount: float = 0.5)
Adjust the lightness of a given color by a specified amount.
- Parameters:
color (str) – The color to be adjusted, in string format.
amount (float) – The amount by which to adjust the lightness of the color. Default value is 0.5.
- Returns:
The adjusted color in RGB format.
- Return type:
tuple
- Raises:
KeyError – If the specified color is not a valid Matplotlib color name.
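The idea of adjusting lightness can be sketched with the standard library alone. The version below is not the library's implementation: it takes an RGB tuple instead of a Matplotlib color name, and it assumes that amount multiplies the HLS lightness, which the description above does not confirm.

```python
import colorsys
from typing import Tuple

RGB = Tuple[float, float, float]


def adjust_lightness_rgb(rgb: RGB, amount: float = 0.5) -> RGB:
    """Scale the HLS lightness of an RGB color (components in 0..1)."""
    hue, lightness, saturation = colorsys.rgb_to_hls(*rgb)
    # Assumption: `amount` is a multiplier; the result is clamped to [0, 1].
    new_lightness = max(0.0, min(1.0, amount * lightness))
    return colorsys.hls_to_rgb(hue, new_lightness, saturation)


# Pure red has lightness 0.5, so amount=0.5 yields a darker red.
darker_red = adjust_lightness_rgb((1.0, 0.0, 0.0), amount=0.5)
print(darker_red)
```

Under this assumption, values of amount below 1 darken the color and values above 1 lighten it.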
- adjust_yaxis(ax: Axes, ydif: float, v: float)
Shift the y-axis of a subplot by a specified amount, while maintaining the location of a specified point.
- Parameters:
ax (AxesSubplot) – The subplot whose y-axis is to be adjusted.
ydif (float) – The amount by which to adjust the y-axis.
v (float) – The location of the point whose position should remain unchanged.
- Returns:
None
- align_yaxis(ax1: Axes, v1: float, ax2: Axes, v2: float)
Adjust the y-axis of two subplots so that the specified values in each subplot are aligned.
- Parameters:
ax1 (AxesSubplot) – The first subplot.
v1 (float) – The value in ax1 that should be aligned with v2 in ax2.
ax2 (AxesSubplot) – The second subplot.
v2 (float) – The value in ax2 that should be aligned with v1 in ax1.
- Returns:
None
- arrow_plot(arrows: Series, figsize: Tuple[int, int] = (6, 6), **kwargs)
Create a polar plot to visualize the frequency of each arrow direction in a pandas Series.
- Parameters:
arrows (pandas.Series) – A pandas Series containing arrow symbols as string data type.
figsize (Tuple[int, int], optional) – The width and height of the figure. Defaults to (6, 6).
**kwargs – Additional keyword arguments to be passed to the bar() method.
- Returns:
Displays the generated plot.
- Return type:
None
- assure_repo()
Ensures that the script is inside a Git repository. Initializes a repository if one is not found.
- Raises:
Exception – For any unexpected errors.
- Returns:
None
- benchmark(timestamp: Arrow, report: str | None = None)
Records the execution time of a report and saves the data to a JSON file.
- Parameters:
timestamp (arrow.Arrow) – The timestamp when the report execution began.
report (str, optional) – The name of the report being benchmarked. If None, tries to obtain the report name from the JPY_SESSION_NAME environment variable.
- Returns:
None
- boxplot(df, mean=True, **kwargs)
Plots a box plot of a given DataFrame using seaborn, with the year of the Release column on the x-axis and the remaining column on the y-axis.
- Parameters:
df (pandas.DataFrame) – The input DataFrame containing the Release dates and another numeric column.
mean (bool, optional) – If True, plots a line representing the mean of each box. Defaults to True.
**kwargs – Additional keyword arguments to pass to seaborn.boxplot().
- Returns:
None
- Raises:
ValueError – If the DataFrame has no Release column.
- card_query(default: str | None = None, *args, **kwargs)
Builds a string of arguments to be passed to the yugipedia Wiki API for a card search query.
- Parameters:
default (str, optional) – The default card type to build a query string for. Can be one of {‘spell’, ‘trap’, ‘st’, ‘monster’, ‘skill’, ‘counter’, ‘speed’, ‘rush’, None}. Defaults to None.
*args – Additional positional arguments to be passed to the API.
**kwargs – Additional keyword arguments to be passed to the API.
- Raises:
ValueError – If default is not a valid card type.
- Returns:
A string containing the arguments to be passed to the API for the card search query.
- Return type:
str
- check_API_status()
Checks if the API is running and reachable by making a query to retrieve site information. If the API is up and running, returns True. If the API is down or unreachable, returns False and prints an error message with details.
- Returns:
True if the API is up and running, False otherwise.
- Return type:
bool
- cleanup_data(dry_run=False)
Cleans up data files, keeping only the most recent file from each month and week.
- Parameters:
dry_run (bool) – If True, the function will only print the files that would be deleted without actually deleting them. Defaults to False.
- Returns:
None
- commit(files: str | List[str], commit_message: str | None = None)
Commits the specified files to the git repository after staging them.
- Parameters:
files (Union[str, List[str]]) – A file path or a list of file paths to be committed.
commit_message (str, optional) – The commit message. If not provided, a default message will be used.
- Raises:
git.InvalidGitRepositoryError – If the PARENT_DIR is not a git repository.
git.GitCommandError – If an error occurs while committing the changes.
Exception – For any other unexpected errors.
- Returns:
None
- condense_benchmark(benchmark: dict)
Condenses a benchmark dictionary by calculating the weighted average and total weight for each key.
- Parameters:
benchmark (dict) – A dictionary containing benchmark data.
- Returns:
The condensed benchmark dictionary with updated entries.
- Return type:
dict
- condense_changelogs(files: DataFrame)
Condenses multiple changelog files into a consolidated dataframe and generates a new filename.
- Parameters:
files (pd.DataFrame) – A dataframe containing the changelog files.
- Returns:
A tuple containing the consolidated changelog dataframe and the new filename.
- Return type:
Tuple[pd.DataFrame, str]
- async download_images(file_names: DataFrame, save_folder: str = '../images/', max_tasks: int = 10)
Downloads a set of images given their names and saves them to a specified folder.
- Parameters:
file_names (pandas.DataFrame) – A DataFrame containing the names of the image files to be downloaded.
save_folder (str) – The path to the folder where the downloaded images will be saved. Defaults to “../images/”.
max_tasks (int) – The maximum number of images to download at once. Defaults to 10.
- Returns:
None
- extract_category_bool(x: List[str])
Extracts a boolean value from a list of strings that represent a boolean value. If the first string in the list is “t”, returns True. If the first string in the list is “f”, returns False. Otherwise, returns np.nan.
- Parameters:
x (List[str]) – The input list of strings to extract a boolean value from.
- Returns:
The extracted boolean value.
- Return type:
Union[bool, np.nan]
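The rule described above can be sketched without NumPy as follows. The function name is hypothetical, float("nan") stands in for np.nan, and treating an empty list as "not a boolean" is an assumption.

```python
import math
from typing import List, Union


def extract_category_bool_sketch(x: List[str]) -> Union[bool, float]:
    """Map a leading "t"/"f" string to True/False, anything else to NaN."""
    if len(x) > 0:
        if x[0] == "t":
            return True
        if x[0] == "f":
            return False
    # Assumption: an empty list is handled like any other non-boolean input.
    return float("nan")


print(extract_category_bool_sketch(["t"]))
print(math.isnan(extract_category_bool_sketch(["unknown"])))
```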
- extract_fulltext(x: List[Dict[str, Any] | str], multiple: bool = False)
Extracts fulltext from a list of dictionaries or strings. If multiple is True, returns a sorted tuple of all fulltexts. Otherwise, returns the first fulltext found, with leading and trailing whitespace removed. If the input list is empty, returns np.nan.
- Parameters:
x (List[Union[Dict[str, Any], str]]) – A list of dictionaries or strings to extract fulltext from.
multiple (bool) – If True, return a tuple of all fulltexts. Otherwise, return the first fulltext. Default is False.
- Returns:
The extracted fulltext(s).
- Return type:
str or Tuple[str] or np.nan
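A minimal sketch of this behavior is shown below. It assumes that dictionary items carry their text under a "fulltext" key and uses float("nan") in place of np.nan; the function name is hypothetical.

```python
import math
from typing import Any, Dict, List, Tuple, Union


def extract_fulltext_sketch(
    x: List[Union[Dict[str, Any], str]], multiple: bool = False
) -> Union[str, Tuple[str, ...], float]:
    """Collect the text of each item; return one value or a sorted tuple."""
    # Assumption: dictionaries expose their text under the "fulltext" key.
    texts = [item["fulltext"] if isinstance(item, dict) else item for item in x]
    if not texts:
        return float("nan")
    if multiple:
        return tuple(sorted(texts))
    return texts[0].strip()


items = [{"fulltext": " Warrior "}, "Dragon"]
print(extract_fulltext_sketch(items))
print(math.isnan(extract_fulltext_sketch([])))
```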
- extract_misc(x: str | List[str] | Tuple[str])
Extracts the misc properties of a card. Checks whether the input contains the values “Legend Card” or “Requires Maximum Mode” and creates a boolean table.
- Parameters:
x (Union[str, List[str], Tuple[str]]) – The Misc values to generate the boolean table from.
- Returns:
A pandas Series of boolean values indicating whether “Legend Card” and “Requires Maximum Mode” are present in the input.
- Return type:
pd.Series
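The membership check described above could look roughly like this. The sketch returns a plain dictionary instead of a pandas Series, and the key names "Legend" and "Maximum mode" are assumptions, as is the function name.

```python
from typing import Dict, List, Tuple, Union


def extract_misc_sketch(x: Union[str, List[str], Tuple[str, ...]]) -> Dict[str, bool]:
    """Flag whether the two known Misc values occur in the input."""
    # A lone string is treated as a one-element collection.
    values = [x] if isinstance(x, str) else list(x)
    return {
        "Legend": "Legend Card" in values,                  # hypothetical key name
        "Maximum mode": "Requires Maximum Mode" in values,  # hypothetical key name
    }


print(extract_misc_sketch(["Legend Card"]))
```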
- extract_primary_type(x: str | List[str] | Tuple[str])
Extracts the primary type of a card. If the input is a list or tuple, removes “Pendulum Monster” and “Maximum Monster” from the list. If the input is a list or tuple with only one element, returns that element. If the input is a list or tuple with multiple elements, returns the first element that is not “Effect Monster”. Otherwise, returns the input.
- Parameters:
x (Union[str, List[str], Tuple[str]]) – The input type(s) to extract the primary type from.
- Returns:
The extracted primary type(s).
- Return type:
Union[str, List[str]]
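The selection rules above can be sketched as follows. The function name is hypothetical, and the fallback for the case where every remaining element is "Effect Monster" is an assumption, since the description does not cover it.

```python
from typing import List, Tuple, Union

SECONDARY_TYPES = ("Pendulum Monster", "Maximum Monster")


def extract_primary_type_sketch(
    x: Union[str, List[str], Tuple[str, ...]]
) -> Union[str, List[str]]:
    """Pick the primary card type from one or more type strings."""
    if isinstance(x, (list, tuple)):
        remaining = [t for t in x if t not in SECONDARY_TYPES]
        if len(remaining) == 1:
            return remaining[0]
        for card_type in remaining:
            if card_type != "Effect Monster":
                return card_type
        # Assumption: nothing qualified, so return the filtered list as-is.
        return remaining
    return x


print(extract_primary_type_sketch(["Effect Monster", "Fusion Monster"]))
```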
- extract_results(response: Response)
Extracts the relevant data from the response object and returns it as a Pandas DataFrame.
- Parameters:
response (requests.Response) – The response object obtained from making a GET request to the Yu-Gi-Oh! Wiki API.
- Returns:
A DataFrame containing the relevant data extracted from the response object.
- Return type:
pd.DataFrame
- fetch_all_set_lists(cg: CG = CG.CG, step: int = 40, **kwargs)
Fetches all set lists for a given card game.
- Parameters:
cg (CG, optional) – The card game to fetch set lists for. Defaults to CG.CG.
step (int, optional) – The number of sets to fetch at once. Defaults to 40.
**kwargs – Additional keyword arguments to pass to fetch_set_list_pages and fetch_set_lists.
- Returns:
A DataFrame containing all set lists for the specified card game.
- Return type:
pd.DataFrame
- Raises:
Any exceptions raised by fetch_set_list_pages() or fetch_set_lists().
- fetch_backlinks(titles: List[str])
Fetches backlinks for a list of page titles.
- Parameters:
titles (List[str]) – A list of titles.
- Returns:
A dictionary mapping backlink titles to their corresponding target titles.
- Return type:
Dict[str, str]
- fetch_bandai(limit: int = 200, *args, **kwargs)
Fetch Bandai cards.
- Parameters:
limit (int, optional) – An integer that represents the maximum number of results to fetch. Defaults to 200.
*args – Additional properties to query.
**kwargs – Keyword arguments to disable specific properties from the query. Remaining keyword arguments are passed to fetch_properties().
- Returns:
A pandas DataFrame object containing the properties of the fetched Bandai cards.
- Return type:
pandas.DataFrame
- fetch_categorymembers(category: str, namespace: int = 0, step: int = 500, iterator=None, debug: bool = False)
Fetches members of a category from the API by making iterative requests with a specified step size until all members are retrieved.
- Parameters:
category (str) – The category to retrieve members for.
namespace (int, optional) – The namespace ID to filter the members by. Defaults to 0 (main namespace).
step (int, optional) – The number of members to retrieve in each request. Defaults to 500.
iterator (tqdm.std.tqdm, optional) – A tqdm iterator to display progress updates. Defaults to None.
debug (bool, optional) – If True, prints the URL of each request for debugging purposes. Defaults to False.
- Returns:
A DataFrame containing the members of the category.
- Return type:
pandas.DataFrame
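The iterative retrieval described above follows a common pagination pattern: request a batch, then keep going while the server hands back a continuation token. The sketch below shows that loop against an in-memory fake; fetch_page, make_fake_api, and the token format are all hypothetical and do not reflect the library's internals.

```python
from typing import Callable, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]


def fetch_all_members(
    fetch_page: Callable[[Optional[str], int], Page], step: int = 500
) -> List[str]:
    """Request pages of `step` items until no continuation token is returned."""
    members: List[str] = []
    token: Optional[str] = None
    while True:
        batch, token = fetch_page(token, step)
        members.extend(batch)
        if token is None:  # the fake API signals the last page this way
            break
    return members


def make_fake_api(titles: List[str]) -> Callable[[Optional[str], int], Page]:
    """Build an in-memory stand-in for a paginated endpoint."""

    def fetch_page(token: Optional[str], step: int) -> Page:
        start = int(token) if token is not None else 0
        end = start + step
        next_token = str(end) if end < len(titles) else None
        return titles[start:end], next_token

    return fetch_page


fake_api = make_fake_api([f"Card {i}" for i in range(7)])
all_members = fetch_all_members(fake_api, step=3)
print(len(all_members))
```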
- fetch_counter(counter_query: str | None = None, cg=CG.CG, step: int = 500, limit: int = 5000, **kwargs)
Fetch counter cards based on query and properties of the cards.
- Parameters:
counter_query (str, optional) – A string representing a SMW query to search for. Defaults to None.
step (int, optional) – An integer that represents the number of results to fetch at a time. Defaults to 500.
limit (int, optional) – An integer that represents the maximum number of results to fetch. Defaults to 5000.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A pandas DataFrame object containing the properties of the fetched counter cards.
- Return type:
pandas.DataFrame
- fetch_errata(errata: str = 'all', step: int = 500, **kwargs)
Fetches errata information from the yugipedia Wiki API.
- Parameters:
errata (str) – The type of errata information to fetch. Valid values are ‘name’, ‘type’, and ‘all’. Defaults to ‘all’.
step (int) – The number of results to fetch in each API call. Defaults to 500.
**kwargs – Additional keyword arguments to pass to fetch_categorymembers.
- Returns:
A pandas DataFrame containing a boolean table indicating whether each card has errata information for the specified type.
- fetch_monster(monster_query: str | None = None, cg: CG = CG.CG, step: int = 500, limit: int = 5000, exclude_token=True, **kwargs)
Fetch monster cards based on query and properties of the cards.
- Parameters:
monster_query (str, optional) – A string representing a SMW query to search for. Defaults to None.
cg (CG, optional) – An Enum that represents the card game to fetch cards from. Defaults to CG.CG.
step (int, optional) – An integer that represents the number of results to fetch at a time. Defaults to 500.
limit (int, optional) – An integer that represents the maximum number of results to fetch. Defaults to 5000.
exclude_token (bool, optional) – A boolean that determines whether to exclude Monster Tokens or not. Defaults to True.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A pandas DataFrame object containing the properties of the fetched monster cards.
- Return type:
pandas.DataFrame
- fetch_properties(condition: str, query: str, step: int = 500, limit: int = 5000, iterator=None, include_all: bool = False, debug: bool = False)
Fetches properties from the API by making iterative requests with a specified step size until a specified limit is reached.
- Parameters:
condition (str) – The query condition to filter the properties by.
query (str) – The query to retrieve the properties.
step (int, optional) – The number of properties to retrieve in each request. Defaults to 500.
limit (int, optional) – The maximum number of properties to retrieve. Defaults to 5000.
iterator (tqdm.std.tqdm, optional) – A tqdm iterator to display progress updates. Defaults to None.
include_all (bool, optional) – If True, includes all properties in the DataFrame. If False, includes only properties that have values. Defaults to False.
debug (bool, optional) – If True, prints the URL of each request for debugging purposes. Defaults to False.
- Returns:
A DataFrame containing the properties matching the query and condition.
- Return type:
pandas.DataFrame
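The interplay of step and limit can be sketched as an offset-based loop that stops at whichever comes first: the limit, or a batch shorter than requested. The names fetch_batch and fake_batch are hypothetical and the data is an in-memory list; this is not the library's implementation.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def fetch_with_limit(
    fetch_batch: Callable[[int, int], List[Row]], step: int = 500, limit: int = 5000
) -> List[Row]:
    """Collect rows in chunks of `step`, never requesting past `limit`."""
    results: List[Row] = []
    for offset in range(0, limit, step):
        size = min(step, limit - offset)
        batch = fetch_batch(offset, size)
        results.extend(batch)
        if len(batch) < size:  # a short batch means the source is exhausted
            break
    return results


ROWS = [{"Name": f"Card {i}"} for i in range(12)]


def fake_batch(offset: int, size: int) -> List[Row]:
    """In-memory stand-in for one API request."""
    return ROWS[offset:offset + size]


capped = fetch_with_limit(fake_batch, step=5, limit=10)
everything = fetch_with_limit(fake_batch, step=5, limit=100)
print(len(capped), len(everything))
```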
- fetch_rarities_dict(rarities_list: List[str] = [])
Fetches backlinks and redirects for a list of rarities, including abbreviations, to generate a map of rarity abbreviations to their corresponding names.
- Parameters:
rarities_list (List[str]) – A list of rarities.
- Returns:
A dictionary mapping rarity abbreviations to their corresponding names.
- Return type:
Dict[str, str]
- fetch_redirects(titles: List[str])
Fetches redirects for a list of page titles.
- Parameters:
titles (List[str]) – A list of titles.
- Returns:
A dictionary mapping source titles to their corresponding redirect targets.
- Return type:
Dict[str, str]
- fetch_rush(rush_query: str | None = None, step: int = 500, limit: int = 5000, **kwargs)
Fetches Rush Duel cards from the Yu-Gi-Oh! Wikia API.
- Parameters:
rush_query (str) – A search query to filter the results. If not provided, it defaults to “rush”.
step (int) – The number of results to fetch in each API call. Defaults to 500.
limit (int) – The maximum number of results to fetch. Defaults to 5000.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A pandas DataFrame containing the fetched Rush Duel cards.
- fetch_set_info(sets: List[str], extra_info: List[str] = [], step: int = 15, **kwargs)
Fetches information for a list of sets.
- Parameters:
sets (List[str]) – A list of set names to fetch information for.
extra_info (List[str], optional) – A list of additional information to fetch for each set. Defaults to an empty list.
step (int, optional) – The number of sets to fetch information for at once. Defaults to 15.
**kwargs – Additional keyword arguments.
- Returns:
A DataFrame containing information for all sets in the list.
- Return type:
pd.DataFrame
- Raises:
Any exceptions raised by requests.get().
- fetch_set_list_pages(cg: CG = CG.CG, step: int = 500, limit=5000, **kwargs)
Fetches a list of ‘Set Card Lists’ pages from the yugipedia Wiki API.
- Parameters:
cg (CG) – A member of the CG enum representing the card game for which set lists are being fetched.
step (int) – The number of pages to fetch in each API request.
limit (int) – The maximum number of pages to fetch.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A DataFrame containing the titles of the set list pages.
- Return type:
pd.DataFrame
- fetch_set_lists(titles: List[str], **kwargs)
Fetches card set lists from a list of page titles.
- Parameters:
titles (List[str]) – A list of page titles from which to fetch set lists.
**kwargs – Additional keyword arguments.
- Returns:
A DataFrame containing the parsed card set lists.
- Return type:
pd.DataFrame
- fetch_skill(skill_query: str | None = None, step: int = 500, limit: int = 5000, **kwargs)
Fetches skill cards from the yugipedia Wiki API.
- Parameters:
skill_query (str) – A string representing a SMW query to search for. Defaults to None.
step (int) – The number of results to fetch in each API call. Defaults to 500.
limit (int) – The maximum number of results to fetch. Defaults to 5000.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A pandas DataFrame containing the fetched skill cards.
- fetch_speed(speed_query: str | None = None, step: int = 500, limit: int = 5000, **kwargs)
Fetches TCG Speed Duel cards from the yugipedia Wiki API.
- Parameters:
speed_query (str) – A string representing a SMW query to search for. Defaults to None.
step (int) – The number of results to fetch in each API call. Defaults to 500.
limit (int) – The maximum number of results to fetch. Defaults to 5000.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A pandas DataFrame containing the fetched TCG Speed Duel cards.
- fetch_st(st_query: str | None = None, st: str = 'both', cg: CG = CG.CG, step: int = 500, limit: int = 5000, **kwargs)
Fetch spell or trap cards based on query and properties of the cards.
- Parameters:
st_query (str, optional) – A string representing a SMW query to search for. Defaults to None.
st (str, optional) – A string representing the type of cards to fetch, either “spell”, “trap”, “both”, or “all”. Defaults to “both”.
cg (CG, optional) – An Enum that represents the card game to fetch cards from. Defaults to CG.CG.
step (int, optional) – An integer that represents the number of results to fetch at a time. Defaults to 500.
limit (int, optional) – An integer that represents the maximum number of results to fetch. Defaults to 5000.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A pandas DataFrame object containing the properties of the fetched spell/trap cards.
- Return type:
pandas.DataFrame
- Raises:
ValueError – Raised if the “st” argument is not one of “spell”, “trap”, “both”, or “all”.
- fetch_token(token_query: str | None = None, cg=CG.CG, step: int = 500, limit: int = 5000, **kwargs)
Fetch token cards based on query and properties of the cards.
- Parameters:
token_query (str, optional) – A string representing a SMW query to search for. Defaults to None.
step (int, optional) – An integer that represents the number of results to fetch at a time. Defaults to 500.
limit (int, optional) – An integer that represents the maximum number of results to fetch. Defaults to 5000.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A pandas DataFrame object containing the properties of the fetched token cards.
- Return type:
pandas.DataFrame
- fetch_unusable(query: str | None = None, cg: CG = CG.CG, filter=True, step: int = 500, limit: int = 5000, **kwargs)
Fetch unusable cards based on query and properties of the cards. Unusable cards include “Strategy cards”, “Tip cards”, “Card Checklists”, etc., which are not actual cards. The filter option enables filtering those out and keeping only cards such as Duelist Kingdom “Ticket cards”, old video-game promo “Character cards” and “Non-game cards” which have the layout of a real card, such as “Everyone’s King”. These criteria are not free of ambiguity.
- Parameters:
query (str, optional) – A string representing a SMW query to search for. Defaults to None.
cg (CG, optional) – An Enum that represents the card game to fetch cards from. Defaults to CG.CG.
filter (bool, optional) – Keep only “Character Cards”, “Non-game cards” and “Ticket Cards”. Defaults to True.
step (int, optional) – An integer that represents the number of results to fetch at a time. Defaults to 500.
limit (int, optional) – An integer that represents the maximum number of results to fetch. Defaults to 5000.
**kwargs – Additional keyword arguments to pass to fetch_properties.
- Returns:
A pandas DataFrame object containing the properties of the fetched unusable cards.
- Return type:
pandas.DataFrame
Generates a Markdown footer with a timestamp.
- Parameters:
timestamp (arrow.Arrow, optional) – The timestamp to use. If None, uses the current time. Defaults to None.
- Returns:
The generated Markdown footer.
- Return type:
Markdown
- format_artwork(row: Series)
Formats a row of a dataframe that contains “alternate artworks” and “edited artworks” columns. If the “alternate artworks” column(s) in the row contain at least one “True” value, adds “Alternate” to the result tuple. If the “edited artworks” column(s) in the row contain at least one “True” value, adds “Edited” to the result tuple. Returns the result tuple.
- Parameters:
row (pd.Series) – A row of a dataframe that contains “alternate artworks” and “edited artworks” columns.
- Returns:
The formatted row as a tuple.
- Return type:
Tuple[str]
- format_df(input_df: DataFrame, include_all: bool = False)
Formats a dataframe containing card information. Returns a new dataframe with specific columns extracted and processed.
- Parameters:
input_df (pd.DataFrame) – The input dataframe to format.
include_all (bool) – If True, include all unspecified columns in the output dataframe. Default is False.
- Returns:
The formatted dataframe.
- Return type:
pd.DataFrame
- format_errata(row: Series)
Formats errata information from a pandas Series and returns a tuple of errata types.
- Parameters:
row (pd.Series) – A pandas Series containing errata information for a single card.
- Returns:
Tuple of errata types if any errata information is present in the input Series, otherwise np.nan.
- Return type:
Tuple[str]
- generate_changelog(previous_df: DataFrame, current_df: DataFrame, col: str | List[str])
Generates a changelog DataFrame by comparing two DataFrames based on a specified column.
- Parameters:
previous_df (pd.DataFrame) – A DataFrame containing the previous version of the data.
current_df (pd.DataFrame) – A DataFrame containing the current version of the data.
col (Union[str, List[str]]) – The name of the column to compare the DataFrames on.
- Returns:
A DataFrame containing the changes made between the previous and current versions of the data. The DataFrame will have the following columns: the specified column name, the modified data, and the indicator for whether the data is new or modified renamed as version (either “Old” or “New”). If there are no changes, the function will return a DataFrame with no rows.
- Return type:
pd.DataFrame
- generate_rate_grid(dy: DataFrame, ax: Axes, xlabel: str = 'Date', size: str = '150%', pad: int = 0, colors: List[str] | None = None, cumsum: bool = True)
Generate a grid of subplots displaying yearly and monthly rates from a Pandas DataFrame.
- Parameters:
dy (pd.DataFrame) – A Pandas DataFrame containing the data to be plotted.
ax (AxesSubplot) – The subplot onto which to plot the grid.
xlabel (str) – The label to be used for the x-axis. Default value is ‘Date’.
size (str) – The size of the bottom subplot as a percentage of the top subplot. Default value is ‘150%’.
pad (int) – The amount of padding between the two subplots in pixels. Default value is 0.
colors (List[str]) – A list of colors to be used in the plot. If not provided, the default Matplotlib color cycle is used. Default value is None.
cumsum (bool) – If True, plot the cumulative sum of the data. If False, plot only the yearly and monthly rates. Default value is True.
- Returns:
None
- header(name: str | None = None)
Generates a Markdown header with a timestamp and the name of the notebook (if provided).
- Parameters:
name (str, optional) – The name of the notebook. If None, attempts to extract the name from the environment variable JPY_SESSION_NAME. Defaults to None.
- Returns:
The generated Markdown header.
- Return type:
Markdown
- load_corrected_latest(name_pattern: str, tuple_cols: List[str] = [])
Loads the most recent data file matching the specified name pattern and applies corrections.
- Parameters:
name_pattern (str) – Data file name pattern to load.
tuple_cols (List[str]) – List of columns containing tuple values to apply literal_eval.
- Returns:
A tuple containing the loaded dataframe and the timestamp of the file.
- Return type:
Tuple[pd.DataFrame, arrow.Arrow]
- load_json(json_file: str)
Load data from a JSON file.
- Parameters:
json_file (str) – The file path to the JSON file.
- Returns:
A dictionary containing the data from the JSON file. If the file does not exist, an empty dictionary is returned.
- Return type:
dict
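The fallback to an empty dictionary can be sketched with the standard library as follows. The function name and the UTF-8 encoding are assumptions for the example.

```python
import json
import os
import tempfile


def load_json_sketch(json_file: str) -> dict:
    """Return the parsed JSON content, or {} when the file does not exist."""
    if not os.path.isfile(json_file):
        return {}
    with open(json_file, "r", encoding="utf-8") as handle:
        return json.load(handle)


# Usage: one existing file and one missing path inside a temporary folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "benchmark.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"runs": 3}, handle)
    loaded = load_json_sketch(path)
    missing = load_json_sketch(os.path.join(tmp_dir, "absent.json"))

print(loaded, missing)
```

Returning an empty dictionary lets callers update and save the data without special-casing the first run.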
- load_secrets(requested_secrets: List[str] = [], secrets_file: str | None = None, required: bool = False)
Load secrets from environment variables and/or a .env file.
The secrets can be specified by name using the requested_secrets argument, which should be a list of strings. If requested_secrets is not specified, all available secrets will be returned.
The secrets_file argument is the path to a .env file containing additional secrets to load. If secrets_file is specified and the file exists, the function will load the secrets from the file and merge them with the secrets loaded from the environment variables giving priority to secrets obtained from the .env file.
The required argument is a boolean or list of booleans indicating whether each requested secret is required to be present. If required is True, a KeyError will be raised if the secret is not found. If required is False or not specified, missing secrets will be skipped.
- Parameters:
requested_secrets (List[str], optional) – A list of names of the secrets to retrieve. If empty or not specified, all available secrets will be returned. Defaults to [].
secrets_file (str, optional) – The path to a .env file containing additional secrets to load. Defaults to None.
required (bool or List[bool], optional) – A boolean or list of booleans indicating whether each requested secret is required to be present. If True, a KeyError will be raised if the secret is not found. If False or not specified, missing secrets will be skipped. Defaults to False.
- Returns:
A dictionary containing the requested secrets as key-value pairs.
- Return type:
Dict[str, str]
- Raises:
KeyError – If a required secret is not found in the environment variables or .env file.
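The merge and lookup rules above can be sketched on plain mappings, leaving out the actual reading of os.environ and the .env file. The function name select_secrets is hypothetical, and the sketch only handles a single boolean for required, not the list form.

```python
from typing import Dict, Mapping, Optional, Sequence


def select_secrets(
    env: Mapping[str, str],
    file_values: Optional[Mapping[str, str]] = None,
    requested: Sequence[str] = (),
    required: bool = False,
) -> Dict[str, str]:
    """Merge two secret sources and pick the requested names."""
    merged = dict(env)
    merged.update(file_values or {})  # values from the .env file take priority
    if not requested:
        return merged  # nothing requested: return everything available
    found: Dict[str, str] = {}
    for name in requested:
        if name in merged:
            found[name] = merged[name]
        elif required:
            raise KeyError(name)
        # otherwise the missing secret is silently skipped
    return found


env_values = {"API_TOKEN": "from-env", "USER": "someone"}
dotenv_values = {"API_TOKEN": "from-file"}
picked = select_secrets(env_values, dotenv_values, requested=["API_TOKEN", "CHAT_ID"])
print(picked)
```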
- make_filename(report: str, timestamp: Arrow, previous_timestamp: Arrow | None = None)
Generates a standardized filename based on the provided parameters.
- Parameters:
report (str) – The name or identifier of the report.
timestamp (arrow.Arrow) – The timestamp to be included in the filename.
previous_timestamp (arrow.Arrow) – The previous timestamp, if applicable. Defaults to None.
- Returns:
The generated filename.
- Return type:
str
- md5(name: str)
Generate the MD5 hash of a string.
- Parameters:
name (str) – The string to hash.
- Returns:
The MD5 hash of the string.
- Return type:
str
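A minimal sketch of such a helper using hashlib is shown below. It assumes the string is encoded as UTF-8 and that the hash is returned as a hexadecimal digest; neither detail is stated in the description above.

```python
import hashlib


def md5_sketch(name: str) -> str:
    """Return the hexadecimal MD5 digest of a string."""
    # Assumption: UTF-8 encoding and hex output.
    return hashlib.md5(name.encode("utf-8")).hexdigest()


print(md5_sketch("hello"))  # 5d41402abc4b2a76b9719d911017c592
```

MD5 is suitable for deriving stable identifiers such as file paths, but not for security-sensitive hashing.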
- merge_errata(input_df: DataFrame, input_errata_df: DataFrame)
Merges errata information from an input errata DataFrame into an input DataFrame based on card names.
- Parameters:
input_df (pd.DataFrame) – A pandas DataFrame containing card information.
input_errata_df (pd.DataFrame) – A pandas DataFrame containing errata information.
- Returns:
A pandas DataFrame with errata information merged into it.
- Return type:
pd.DataFrame
- merge_set_info(input_df: DataFrame, input_info_df: DataFrame)
Merges set information from an input set info DataFrame into an input set list DataFrame based on set and region.
- Parameters:
input_df (pd.DataFrame) – A pandas DataFrame containing set lists.
input_info_df (pd.DataFrame) – A pandas DataFrame containing set information.
- Returns:
A pandas DataFrame with set information merged into it.
- Return type:
pd.DataFrame
- rate_plot(dy: DataFrame, figsize: Tuple[int, int] = (16, 6), title: str | None = None, xlabel: str = 'Date', colors: List[str] | None = None, cumsum: bool = True, bg: DataFrame | None = None, vlines: DataFrame | None = None)
Creates a single plot to visualize the rate of change over time of a single variable in a pandas DataFrame.
- Parameters:
dy (pd.DataFrame) – The pandas DataFrame containing the data to plot.
figsize (Tuple[int, int]) – The size of the figure to create. Default is (16, 6).
title (str) – The title of the figure. Default is None.
xlabel (str) – The label of the x-axis. Default is ‘Date’.
colors (List[str]) – The list of colors to use for the lines. If None, default colors are used.
cumsum (bool) – Whether to plot the cumulative sum of the data. Default is True.
bg (pd.DataFrame) – A DataFrame containing the background shading data. Default is None.
vlines (pd.DataFrame) – A DataFrame containing the vertical line data. Default is None.
- Returns:
Displays the generated plot.
- Return type:
None
- rate_subplots(df: DataFrame, figsize: Tuple[int, int] | None = None, title: str = '', xlabel: str = 'Date', colors: List[str] | None = None, cumsum: bool = True, bg: DataFrame | None = None, vlines: DataFrame | None = None)
Creates a grid of subplots to visualize rates of change over time of multiple variables in a pandas DataFrame.
- Parameters:
df (pd.DataFrame) – The pandas DataFrame containing the data to plot.
figsize (Tuple[int, int] or None) – The size of the figure to create. If None, default size is (16, len(df.columns)*2*(1+cumsum)).
title (str) – The title of the figure. Default is an empty string.
xlabel (str) – The label of the x-axis. Default is ‘Date’.
colors (List[str]) – The list of colors to use for the lines. If None, default colors are used.
cumsum (bool) – Whether to plot the cumulative sum of the data. Default is True.
bg (pd.DataFrame) – A DataFrame containing the background shading data. Default is None.
vlines (pd.DataFrame) – A DataFrame containing the vertical line data. Default is None.
- Returns:
Displays the generated plot.
- Return type:
None
- run(reports: str | List[str] = 'all', progress_handler=None, telegram_first: bool = False, suppress_contribs: bool = False, cleanup: bool | str = False, dry_run: bool = False, **kwargs)
Executes all notebooks in the source directory that match the specified report, updates the page index to reflect the last execution timestamp, and cleans up redundant data files.
- Parameters:
reports (Union[str, List[str]], optional) – The report or list of reports to generate. Defaults to ‘all’.
progress_handler (function, optional) – A progress handler function to report execution progress. Defaults to None.
telegram_first (bool, optional) – Defaults to False.
suppress_contribs (bool, optional) – Defaults to False.
cleanup (Union[bool, str], optional) – Whether to clean up data files after execution. If True, performs cleanup. If False, does not perform cleanup. If ‘auto’, performs cleanup if there are more than 4 data files for each report (assuming one per week). Defaults to False.
dry_run (bool, optional) – dry_run flag to pass to cleanup_data method call. Defaults to False.
**kwargs – Additional keyword arguments to pass to run_notebooks.
- Returns:
This function does not return a value.
- Return type:
None
- run_notebooks(reports: str | List[str] = 'all', progress_handler: Callable | None = None, telegram_first: bool = False, suppress_contribs: bool = False, **kwargs)
Execute specified Jupyter notebooks in the source directory using Papermill.
- Parameters:
reports (Union[str, List[str]]) – List of notebooks to execute or ‘all’ to execute all notebooks in the source directory. Default is ‘all’.
progress_handler (callable) – An optional callable to provide progress bar functionality. Default is None.
telegram_first (bool, optional) – Default is False.
suppress_contribs (bool, optional) – Default is False.
**kwargs – Additional keyword arguments containing secrets key-value pairs to pass to TQDM contrib iterators.
- Returns:
None
- save_notebook()
Save the current notebook opened in JupyterLab to disk.
- Parameters:
None
- Returns:
None
- separate_words_and_acronyms(strings: List[str])
Separates a list of strings into words and acronyms.
- Parameters:
strings (List[str]) – A list of strings to be categorized.
- Returns:
A tuple containing two lists. The first list contains words (strings starting with an uppercase letter followed by lowercase letters). The second list contains acronyms (strings not meeting the word criteria).
- Return type:
Tuple[List[str], List[str]]
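The word criterion above can be sketched with a regular expression. The exact pattern is an assumption: here a word is one uppercase letter followed by one or more lowercase letters, and everything else counts as an acronym. The function name is hypothetical.

```python
import re
from typing import List, Tuple

# Assumed pattern: one uppercase letter, then at least one lowercase letter.
WORD_PATTERN = re.compile(r"[A-Z][a-z]+")


def separate_sketch(strings: List[str]) -> Tuple[List[str], List[str]]:
    """Split strings into (words, acronyms), preserving input order."""
    words: List[str] = []
    acronyms: List[str] = []
    for text in strings:
        if WORD_PATTERN.fullmatch(text):
            words.append(text)
        else:
            acronyms.append(text)
    return words, acronyms


print(separate_sketch(["Rare", "UR", "Common", "SR"]))
```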
- update_index()
Update the index.md and README.md files with a table of links to all HTML reports in the parent directory. Also update the @REPORT_|_TIMESTAMP@ and @TIMESTAMP@ placeholders in the index.md file with the latest timestamp. If the update is successful, commit the changes to Git with a commit message that includes the timestamp. If the index.md or README.md file is missing from the assets directory, print an error message and abort.
- Raises:
FileNotFoundError – If the “index.md” or “README.md” files in “assets” are not found.
- Returns:
None