repo_stats API
citation_metrics
- class repo_stats.citation_metrics.ADSCitations(token, cache_dir)
Class for getting, processing and aggregating citation data from the NASA ADS database for a given set of papers.
- Parameters:
- aggregate_citations(bibcode, metric='bibcode, pubdate, pub, author, title')
Get, process and aggregate citation data in ‘metric’ for all papers in ‘bibcode’.
- Parameters:
- Returns:
all_stats – Individual and aggregated citation statistics across all papers in ‘bibcode’
- Return type:
- get_citations(bib, metric)
Get citation data for a paper with the identifier ‘bib’ by querying the ADS API.
- Parameters:
- Returns:
all_cites – For each citation to the paper ‘bib’, a dictionary of ‘metric’ data
- Return type:
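As an illustration of the kind of query 'get_citations' describes, the sketch below builds (but does not send) a request to the public ADS search endpoint for the papers citing 'bib'. This is only a sketch under stated assumptions: the helper name 'build_citation_request', the 'rows' and 'start' arguments, and the bibcode and token values are hypothetical, and the library may construct its request differently.

```python
import urllib.parse
import urllib.request

ADS_SEARCH_URL = "https://api.adsabs.harvard.edu/v1/search/query"


def build_citation_request(bib, metric, token, rows=200, start=0):
    """Build (without sending) an ADS search request for papers citing `bib`.

    Hypothetical helper, not part of repo_stats.
    """
    params = {
        # ADS second-order operator: return the papers that cite `bib`
        "q": f"citations(bibcode:{bib})",
        # Comma-separated list of fields to return for each citing paper
        "fl": ",".join(field.strip() for field in metric.split(",")),
        "rows": rows,
        "start": start,
    }
    url = f"{ADS_SEARCH_URL}?{urllib.parse.urlencode(params)}"
    # The ADS API authenticates with a bearer token
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


request = build_citation_request(
    "2019ApJ...000..000X",  # placeholder bibcode, for illustration only
    "bibcode, pubdate, pub, author, title",
    token="YOUR_ADS_TOKEN",  # placeholder token
)
print(request.full_url)
```

Sending the request (for example with 'urllib.request.urlopen') and paging through results with 'start' is left out here so the sketch runs without network access.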
- process_citations(citations)
Process (obtain statistics for) citation data in ‘citations’.
- Parameters:
citations (list of dict) – Dictionary of data for each citation to the reference paper
- Returns:
stats – Citation statistics:
‘cite_all’: total number of citations
‘cite_year’: citations in the current year
‘cite_month’: citations in the previous month
‘cite_per_year’: citations per year
‘cite_bibcodes’: bibcodes of all citations
- Return type:
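The statistics listed above could be computed roughly as in the following sketch. It assumes each citation dictionary has 'bibcode' and 'pubdate' keys, that 'pubdate' starts with 'YYYY-MM', and that 'cite_per_year' is a mapping from year to count. The function name 'summarize_citations' and the sample records are hypothetical, not taken from the library.

```python
from collections import Counter
from datetime import date


def summarize_citations(citations, today=None):
    """Hypothetical sketch of citation statistics; not the library's code."""
    today = today or date.today()

    # Keep only the (year, month) part of each 'pubdate' string
    months = []
    for cite in citations:
        year, month = cite["pubdate"].split("-")[:2]
        months.append((int(year), int(month)))

    # The month before `today`, wrapping from January to December
    if today.month == 1:
        prev_month = (today.year - 1, 12)
    else:
        prev_month = (today.year, today.month - 1)

    per_year = Counter(year for year, _ in months)
    return {
        "cite_all": len(citations),
        "cite_year": per_year[today.year],
        "cite_month": sum(1 for ym in months if ym == prev_month),
        "cite_per_year": dict(sorted(per_year.items())),
        "cite_bibcodes": [cite["bibcode"] for cite in citations],
    }


# Made-up records for illustration only
sample = [
    {"bibcode": "A", "pubdate": "2023-11-00"},
    {"bibcode": "B", "pubdate": "2024-02-00"},
    {"bibcode": "C", "pubdate": "2024-03-00"},
]
stats = summarize_citations(sample, today=date(2024, 3, 15))
print(stats)
```

Passing 'today' explicitly keeps the example reproducible; with the sample above, two citations fall in 2024 and one in the previous month (February 2024).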
git_metrics
- class repo_stats.git_metrics.GitMetrics(token, repo_owner, repo_name, cache_dir)
Class for getting and processing repository data (commit history, issues, pull requests, contributors) from GitHub for a given repository.
- Parameters:
- get_commits()
Obtain the commit history for a repository with ‘git log’, and parse the output.
- get_commits_via_git_log(repo_local_path)
Obtain the commit history for a repository by running ‘git log’ on a local copy of the repository, then parse the output.
- Parameters:
repo_local_path (str) – Path to local copy of repository
- Returns:
dates (list of str) – Date of each commit
author_commits (dict) – Keys are the authors; each value is a list of the commits that author has contributed
- get_issues_prs(item_type)
Obtain the issue or pull request history for a GitHub repository by querying the GraphQL API.
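To make the GraphQL query concrete, here is a minimal sketch of a request body that could be POSTed to GitHub's GraphQL endpoint. It is built under assumptions: that 'item_type' is either 'issues' or 'pullRequests', and that creation date, close date, state and labels are the fields of interest. The names 'QUERY_TEMPLATE' and 'build_graphql_payload' and the owner and repository values are hypothetical; the query the library actually sends may differ.

```python
import json

# Endpoint the payload would be POSTed to, with an 'Authorization: Bearer <token>' header
GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

QUERY_TEMPLATE = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    __ITEM_TYPE__(first: 100, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes {
        createdAt
        closedAt
        state
        labels(first: 20) { nodes { name } }
      }
    }
  }
}
"""


def build_graphql_payload(item_type, repo_owner, repo_name, cursor=None):
    """Return the JSON body for one page of issues or pull requests.

    Hypothetical helper, not part of repo_stats.
    """
    if item_type not in ("issues", "pullRequests"):
        raise ValueError("item_type must be 'issues' or 'pullRequests'")
    body = {
        "query": QUERY_TEMPLATE.replace("__ITEM_TYPE__", item_type),
        # `cursor` is None for the first page, then the previous page's endCursor
        "variables": {"owner": repo_owner, "name": repo_name, "cursor": cursor},
    }
    return json.dumps(body)


payload = build_graphql_payload("pullRequests", "example-owner", "example-repo")
print(payload)
```

The API returns at most 100 nodes per page, so a full history would be collected by repeating the request with 'endCursor' until 'hasNextPage' is false.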
- parse_log_line(line)
Break an individual ‘git log’ line ‘line’ into its component parts (commit hash, date, author).
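A minimal sketch of this kind of line parsing, assuming the log was produced with a pipe-delimited format such as 'git log --pretty=format:%H|%ad|%an --date=short'. The delimiter, the function name 'split_log_line' and the sample line are assumptions for illustration; the format the library actually uses is not stated here.

```python
def split_log_line(line, sep="|"):
    """Split one delimited 'git log' line into (commit hash, date, author).

    Hypothetical helper, not part of repo_stats.
    """
    # maxsplit=2 keeps any further separators inside the author name intact
    commit_hash, date, author = line.strip().split(sep, 2)
    return commit_hash, date, author


# Made-up log line for illustration only
parsed = split_log_line("a1b2c3d|2024-01-15|Ada Lovelace\n")
print(parsed)  # ('a1b2c3d', '2024-01-15', 'Ada Lovelace')
```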
- process_commits(results, age_recent=90)
Process (obtain statistics for) git commit data.
- Parameters:
- Returns:
stats – Commit statistics:
‘age_recent_commit’: the input arg ‘age_recent’
‘unique_authors’: each commit author, their number of commits and the index of their first commit
‘new_authors’: list of authors whose first commit falls within the last ‘age_recent’ days
‘n_recent_authors’: number of authors with commits within the last ‘age_recent’ days
‘authors_per_month’: number of commit authors per month, over time
‘new_authors_per_month’: number of new commit authors per month, over time
‘multi_authors_per_month’: number of commit authors per month with >1 commit that month, over time
- Return type:
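The per-month author statistics above can be illustrated with a short sketch. It assumes parallel lists of commit dates ('YYYY-MM-DD' strings) and author names. The function name 'monthly_author_counts', the dictionary-per-month return shape and the sample data are hypothetical and may not match the library's output.

```python
from collections import Counter, defaultdict


def monthly_author_counts(dates, authors):
    """Hypothetical sketch of per-month author statistics."""
    commits_by_month = defaultdict(Counter)  # month -> {author: number of commits}
    first_month = {}  # author -> month of their earliest commit
    for day, author in zip(dates, authors):
        month = day[:7]  # 'YYYY-MM'
        commits_by_month[month][author] += 1
        # 'YYYY-MM' strings sort chronologically, so min() finds the earliest
        first_month[author] = min(first_month.get(author, month), month)

    months = sorted(commits_by_month)
    return {
        "authors_per_month": {m: len(commits_by_month[m]) for m in months},
        "new_authors_per_month": {
            m: sum(1 for first in first_month.values() if first == m) for m in months
        },
        "multi_authors_per_month": {
            m: sum(1 for n in commits_by_month[m].values() if n > 1) for m in months
        },
    }


# Made-up commit history for illustration only
dates = ["2024-01-03", "2024-01-20", "2024-02-02", "2024-02-10", "2024-02-11"]
authors = ["ana", "ana", "ana", "ben", "ben"]
monthly = monthly_author_counts(dates, authors)
print(monthly)
```

In this sample, February has two authors, one of them new ('ben'), and one author with more than one commit that month.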
- process_issues_prs(results, items, labels, age_recent=90)
Process (obtain statistics for) and aggregate issue and pull request data in ‘results’.
- Parameters:
results (list of dict) – A dictionary entry for each issue or pull request in the history (see git_metrics.GitMetrics.get_issues_prs)
items (list of str) – Names for the dictionary entries in the return ‘issues_prs’
labels (list of str) – GitHub labels (those added to an issue or pull request) to obtain additional statistics
age_recent (int, default=90) – Days before present used to categorize recent issue and pull request statistics
- Returns:
issues_prs – Statistics for issues and, separately, for pull requests:
‘age_recent’: the input arg ‘age_recent’
‘recent_open’: number of items (issues or pull requests) opened within the last ‘age_recent’ days
‘recent_close’: number of items closed within the last ‘age_recent’ days
‘open_per_month’: number of items opened per month, over time
‘close_per_month’: number of items closed per month, over time
‘label_open’: the input arg ‘labels’ and the number of currently open items with each label
- Return type:
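The "recent" and label statistics above might be computed along these lines. The record layout ('createdAt', 'closedAt' and 'labels' keys, with ISO 8601 timestamps) is an assumption, as are the names 'parse_timestamp' and 'summarize_items' and the sample records. This is a sketch of the idea for a single item type, not the library's implementation.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(stamp):
    """Parse an ISO 8601 timestamp such as '2024-05-01T12:00:00Z'."""
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def summarize_items(results, labels, age_recent=90, now=None):
    """Hypothetical sketch of recent and per-label statistics."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=age_recent)

    recent_open = sum(
        1 for item in results if parse_timestamp(item["createdAt"]) >= cutoff
    )
    recent_close = sum(
        1
        for item in results
        if item["closedAt"] is not None and parse_timestamp(item["closedAt"]) >= cutoff
    )
    # An item without a close date is treated as currently open
    still_open = [item for item in results if item["closedAt"] is None]
    label_open = {
        label: sum(1 for item in still_open if label in item["labels"])
        for label in labels
    }
    return {
        "age_recent": age_recent,
        "recent_open": recent_open,
        "recent_close": recent_close,
        "label_open": label_open,
    }


# Made-up records for illustration only
records = [
    {"createdAt": "2024-05-01T12:00:00Z", "closedAt": None, "labels": ["bug"]},
    {"createdAt": "2023-01-10T08:00:00Z", "closedAt": "2024-04-15T09:30:00Z", "labels": ["bug"]},
    {"createdAt": "2023-02-01T10:00:00Z", "closedAt": None, "labels": ["docs"]},
]
summary = summarize_items(
    records, ["bug", "docs"], now=datetime(2024, 6, 1, tzinfo=timezone.utc)
)
print(summary)
```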
plot
- repo_stats.plot.author_time_plot(commit_stats, repo_owner, repo_name, cache_dir, window_avg=7)
Plot repository commit authors over time.
- Parameters:
commit_stats (dict) – Dictionary including commit statistics. See git_metrics.GitMetrics.process_commits()
repo_owner (str) – Owner of repository (for labels)
repo_name (str) – Name of repository (for labels and figure savename)
cache_dir (str) – Name of directory in which to cache figure
window_avg (int, default=7) – Number of months for rolling average of commit data. Enforced to be odd.
- Returns:
fig – The generated figure
- Return type:
plt.figure instance
- repo_stats.plot.citation_plot(cite_stats, repo_name, cache_dir, names=None)
Plot citations to referenced papers over time.
- Parameters:
cite_stats (dict) – Dictionary including citation statistics. See citation_metrics.ADSCitations.aggregate_citations()
repo_name (str) – Name of repository (for labels and figure savename)
cache_dir (str) – Name of directory in which to cache figure
names (list of str, optional) – Names of referenced papers (for plot legend)
- Returns:
fig – The generated figure
- Return type:
plt.figure instance
- repo_stats.plot.open_issue_pr_plot(issue_pr_stats, repo_name, cache_dir)
Plot a bar chart of a repository’s currently open issues and pull requests.
- Parameters:
- Returns:
fig – The generated figure
- Return type:
plt.figure instance
- repo_stats.plot.issue_pr_time_plot(issue_pr_stats, repo_owner, repo_name, cache_dir, window_avg=7)
Plot the number of a repository’s issues and pull requests opened and closed over time.
- Parameters:
issue_pr_stats (list of dict) – Statistics for issues and pull requests (see git_metrics.GitMetrics.process_issues_prs)
repo_owner (str) – Owner of repository (for labels)
repo_name (str) – Name of repository (for labels and figure savename)
cache_dir (str) – Name of directory in which to cache figure
window_avg (int, default=7) – Number of months for rolling average of issue and pull request data. Enforced to be odd.
- Returns:
fig – The generated figure
- Return type:
plt.figure instance
runner
user_stats
- class repo_stats.user_stats.StatsImage(template_image, font)
Class for updating a template image (e.g. to be displayed in a GitHub README) with repository and citation statistics.
- Parameters:
- draw_text(coords, text, text_color=None, font=None, **kwargs)
Convenience wrapper for ‘PIL.ImageDraw.Draw’.
utilities
- repo_stats.utilities.fill_missed_months(unique_output)
Given the output of ‘np.unique(x, return_counts=True)’, where ‘x’ is a list of dates in the format ‘2024-01’ (year-month), add the months missing from this list and set their count to 0.
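A minimal sketch of this gap filling, written in plain Python so that it accepts any (values, counts) pair, including the tuple returned by 'np.unique(x, return_counts=True)'. The function name 'fill_months' is hypothetical, and the choice to fill only between the earliest and latest month present is an assumption; the library function may, for example, extend to the current month.

```python
def fill_months(unique_output):
    """Insert missing 'YYYY-MM' months with a count of 0 (hypothetical sketch)."""
    months, counts = unique_output
    known = {str(month): int(count) for month, count in zip(months, counts)}

    # 'YYYY-MM' strings sort chronologically, so min/max give the date range
    year, month = map(int, min(known).split("-"))
    end_year, end_month = map(int, max(known).split("-"))

    filled_months, filled_counts = [], []
    while (year, month) <= (end_year, end_month):
        key = f"{year:04d}-{month:02d}"
        filled_months.append(key)
        filled_counts.append(known.get(key, 0))  # 0 for months with no entries
        month += 1
        if month > 12:  # roll over into the next year
            year, month = year + 1, 1
    return filled_months, filled_counts


filled = fill_months((["2023-11", "2024-02"], [3, 1]))
print(filled)  # (['2023-11', '2023-12', '2024-01', '2024-02'], [3, 0, 0, 1])
```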
- repo_stats.utilities.rolling_average(unaveraged, window)
Obtain a rolling average of ‘unaveraged’ data in a sliding window of index length ‘window’.
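For illustration, a centered rolling average could look like the sketch below. It assumes an odd 'window' and averages over whatever part of the window fits at the ends of the data. The function name 'rolling_mean' and this edge handling are assumptions; the library function may treat the boundaries differently.

```python
def rolling_mean(unaveraged, window):
    """Centered rolling average with a truncated window at the edges.

    Hypothetical sketch, not the library's implementation.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    averaged = []
    for i in range(len(unaveraged)):
        # Slice the values within `half` indices of position i
        chunk = unaveraged[max(0, i - half): i + half + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


smoothed = rolling_mean([0, 3, 6, 3, 0], 3)
print(smoothed)  # [1.5, 3.0, 4.0, 3.0, 1.5]
```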