repo_stats API
citation_metrics
- class repo_stats.citation_metrics.ADSCitations(token, cache_dir)[source]
- Class for getting, processing and aggregating citation data from the NASA ADS database for a given set of papers.
- Parameters:
 - aggregate_citations(bibcode, metric='bibcode, pubdate, pub, author, title')[source]
- Get, process and aggregate citation data in ‘metric’ for all papers in ‘bibcode’.
- Parameters:
- Returns:
- all_stats – Individual and aggregated citation statistics across all papers in ‘bibcode’ 
- Return type:
 
 - get_citations(bib, metric)[source]
- Get citation data for a paper with the identifier ‘bib’ by querying the ADS API.
- Parameters:
- Returns:
- all_cites – For each citation to the paper ‘bib’, a dictionary of ‘metric’ data 
- Return type:
 
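As a rough illustration of the kind of query described above, the sketch below builds (but does not send) a request to the ADS search endpoint for all papers citing ‘bib’. The helper name `build_citation_request`, the `rows` value and the placeholder bibcode are assumptions made for this example; they are not taken from repo_stats.

```python
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import Request

ADS_SEARCH_URL = "https://api.adsabs.harvard.edu/v1/search/query"


def build_citation_request(bib, metric, token, rows=2000):
    """Build an ADS search request for papers citing `bib` (hypothetical helper)."""
    params = {
        # ADS 'citations()' operator: papers that cite the given record
        "q": f"citations(bibcode:{bib})",
        # Fields to return; spaces are stripped from the comma-separated list
        "fl": metric.replace(" ", ""),
        # Page size (assumed value for this sketch)
        "rows": rows,
    }
    return Request(
        f"{ADS_SEARCH_URL}?{urlencode(params)}",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder identifier and token, for illustration only
req = build_citation_request("PLACEHOLDER_BIBCODE", "bibcode, pubdate", token="MY_TOKEN")
query = parse_qs(urlparse(req.full_url).query)
```

Sending the request (for example with `urllib.request.urlopen(req)`) needs a valid ADS token and network access, so it is left out here.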
 - process_citations(citations)[source]
- Process (obtain statistics for) citation data in ‘citations’.
- Parameters:
- citations (list of dict) – Dictionary of data for each citation to the reference paper 
- Returns:
- stats – Citation statistics:
- ‘cite_all’: total number of citations 
- ‘cite_year’: citations in current year 
- ‘cite_month’: citations in previous month 
- ‘cite_per_year’: citations per year 
- ‘cite_bibcodes’: bibcodes of all citations 
 
 
- Return type:
 
 
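The statistics listed for process_citations can be sketched as follows. This is a minimal illustration and not the library's implementation: the function name `summarize_citations`, the explicit `today` argument, the ‘YYYY-MM-…’ form of ‘pubdate’ and the reading of ‘cite_per_year’ as a count per calendar year are all assumptions.

```python
from datetime import date


def summarize_citations(citations, today):
    """Summarize a list of citation dicts with 'bibcode' and 'pubdate' keys."""
    # (year, month) of each citing paper, read from a 'YYYY-MM-...' string
    year_month = [(int(c["pubdate"][:4]), int(c["pubdate"][5:7])) for c in citations]

    # Previous calendar month, wrapping around January
    if today.month > 1:
        previous = (today.year, today.month - 1)
    else:
        previous = (today.year - 1, 12)

    per_year = {}
    for year, _ in year_month:
        per_year[year] = per_year.get(year, 0) + 1

    return {
        "cite_all": len(citations),
        "cite_year": per_year.get(today.year, 0),
        "cite_month": sum(1 for ym in year_month if ym == previous),
        "cite_per_year": dict(sorted(per_year.items())),
        "cite_bibcodes": [c["bibcode"] for c in citations],
    }


# Made-up citation records, for illustration only
example = [
    {"bibcode": "A", "pubdate": "2023-11-00"},
    {"bibcode": "B", "pubdate": "2024-02-00"},
    {"bibcode": "C", "pubdate": "2024-03-00"},
]
stats = summarize_citations(example, today=date(2024, 4, 15))
```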
git_metrics
- class repo_stats.git_metrics.GitMetrics(token, repo_owner, repo_name, cache_dir)[source]
- Class for getting and processing repository data (commit history, issues, pull requests, contributors) from GitHub for a given repository.
- Parameters:
 - get_commits()[source]
- Obtain the commit history for a repository with ‘git log’, and parse the output. 
 - get_commits_via_git_log(repo_local_path)[source]
- Obtain the commit history for a repository by running ‘git log’ on a local copy of the repository, and parse the output.
- Parameters:
- repo_local_path (str) – Path to local copy of repository 
- Returns:
- dates (list of str) – Date of each commit 
- author_commits (dict) – Keys are authors; each value is a list of the commits that author has contributed 
 
 
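One way to obtain the two return values described above is sketched here. The log format string, the ‘|’ separator and the helper names `read_git_log` and `group_commits` are assumptions chosen for this example; the format actually used by repo_stats is not documented on this page.

```python
import subprocess

# hash | ISO 8601 author date | author name (assumed format for this sketch)
LOG_FORMAT = "%H|%aI|%an"


def read_git_log(repo_local_path):
    """Run 'git log' in a local clone and return its output lines."""
    result = subprocess.run(
        ["git", "-C", repo_local_path, "log", f"--pretty=format:{LOG_FORMAT}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def group_commits(lines):
    """Collect commit dates and the commit hashes contributed by each author."""
    dates, author_commits = [], {}
    for line in lines:
        # maxsplit=2 keeps any '|' inside the author name intact
        commit_hash, commit_date, author = line.split("|", 2)
        dates.append(commit_date)
        author_commits.setdefault(author, []).append(commit_hash)
    return dates, author_commits


# Made-up log lines, so the example runs without a repository
sample = [
    "c3|2024-03-05T09:00:00+00:00|Ann",
    "b2|2024-02-01T12:30:00+00:00|Bob",
    "a1|2024-01-15T08:45:00+00:00|Ann",
]
dates, author_commits = group_commits(sample)
```

With a real clone, `group_commits(read_git_log("path/to/clone"))` would produce the same structure.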
 - get_issues_prs(item_type)[source]
- Obtain the issue or pull request history for a GitHub repository by querying the GraphQL API. 
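A GraphQL request of the kind described above might look like the sketch below, which only builds the request payload. The selected fields, the page size, the accepted ‘item_type’ values and the helper name `build_payload` are assumptions for illustration; the query that repo_stats actually sends may differ.

```python
import json

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# %(item_type)s is filled with the connection name: 'issues' or 'pullRequests'
QUERY_TEMPLATE = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    %(item_type)s(first: 100, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes {
        createdAt
        closedAt
        state
        labels(first: 20) { nodes { name } }
      }
    }
  }
}
"""


def build_payload(repo_owner, repo_name, item_type, cursor=None):
    """Build the JSON payload for one page of issues or pull requests."""
    if item_type not in ("issues", "pullRequests"):
        raise ValueError("item_type must be 'issues' or 'pullRequests'")
    return {
        "query": QUERY_TEMPLATE % {"item_type": item_type},
        "variables": {"owner": repo_owner, "name": repo_name, "cursor": cursor},
    }


# Placeholder owner and repository names
payload = build_payload("some-owner", "some-repo", "issues")
body = json.dumps(payload)
```

The payload would be sent as a POST request to `GITHUB_GRAPHQL_URL` with an `Authorization: Bearer <token>` header, repeating with `cursor` set to the returned `endCursor` while `hasNextPage` is true.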
 - parse_log_line(line)[source]
- Break an individual ‘git log’ line ‘line’ into its component parts (commit hash, date, author). 
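Splitting a log line into its parts can be sketched as below, assuming each line has the form ‘hash|ISO 8601 date|author’ (for example as produced by `git log --pretty=format:"%H|%aI|%an"`). Both the line format and the function name `split_log_line` are assumptions for this example.

```python
from datetime import datetime


def split_log_line(line):
    """Split 'hash|ISO date|author' into (hash, datetime, author)."""
    # maxsplit=2 so that a '|' inside the author name is preserved
    commit_hash, iso_date, author = line.strip().split("|", 2)
    return commit_hash, datetime.fromisoformat(iso_date), author


# Made-up log line, for illustration only
commit_hash, commit_date, author = split_log_line(
    "abc123|2024-01-15T10:30:00+01:00|Ada Lovelace\n"
)
```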
 - process_commits(results, age_recent=90)[source]
- Process (obtain statistics for) git commit data.
- Parameters:
- Returns:
- stats – Commit statistics:
- ‘age_recent_commit’: the input arg ‘age_recent’ 
- ‘unique_authors’: each commit author, their number of commits and the index of their first commit 
- ‘new_authors’: list of authors whose first commit falls within ‘age_recent’ 
- ‘n_recent_authors’: number of authors with commits within ‘age_recent’ 
- ‘authors_per_month’: number of commit authors per month, over time 
- ‘new_authors_per_month’: number of new commit authors per month, over time 
- ‘multi_authors_per_month’: number of commit authors per month with >1 commit that month, over time 
 
 
- Return type:
 
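Several of the commit statistics above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: the input is taken to be a list of (datetime, author) pairs ordered oldest first, ‘age_recent’ is read as a number of days before `now`, ‘unique_authors’ is reduced to a commit count per author, and the name `commit_author_stats` is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def commit_author_stats(commits, now, age_recent=90):
    """Summarize (datetime, author) pairs that are ordered oldest first."""
    cutoff = now - timedelta(days=age_recent)

    counts = Counter()
    first_commit = {}
    authors_by_month = {}
    for when, author in commits:
        counts[author] += 1
        # Keep only the earliest commit date seen for each author
        first_commit.setdefault(author, when)
        authors_by_month.setdefault(when.strftime("%Y-%m"), set()).add(author)

    return {
        "age_recent_commit": age_recent,
        "unique_authors": dict(counts),
        "new_authors": sorted(a for a, d in first_commit.items() if d >= cutoff),
        "n_recent_authors": len({a for d, a in commits if d >= cutoff}),
        "authors_per_month": {
            month: len(names) for month, names in sorted(authors_by_month.items())
        },
    }


# Made-up commit history, for illustration only
history = [
    (datetime(2023, 1, 10), "ann"),
    (datetime(2023, 1, 20), "bob"),
    (datetime(2024, 2, 15), "ann"),
    (datetime(2024, 3, 5), "cy"),
]
commit_stats = commit_author_stats(history, now=datetime(2024, 4, 1))
```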
 - process_issues_prs(results, items, labels, age_recent=90)[source]
- Process (obtain statistics for) and aggregate issue and pull request data in ‘results’.
- Parameters:
- results (list of dict) – A dictionary entry for each issue or pull request in the history (see git_metrics.GitMetrics.get_issues_prs)
- items (list of str) – Names for the dictionary entries in the return ‘issues_prs’ 
- labels (list of str) – GitHub labels (those added to an issue or pull request) to obtain additional statistics 
- age_recent (int, default=90) – Days before present used to categorize recent issue and pull request statistics 
 
- Returns:
- issues_prs – Statistics for issues and, separately, for pull requests:
- ‘age_recent’: the input arg ‘age_recent’ 
- ‘recent_open’: number of items (issues or pull requests) opened within ‘age_recent’ 
- ‘recent_close’: number of items closed within ‘age_recent’ 
- ‘open_per_month’: number of items opened per month, over time 
- ‘close_per_month’: number of items closed per month, over time 
- ‘label_open’: the input arg ‘labels’ and the number of currently open items with each label 
 
 
- Return type:
 
 
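The issue and pull request statistics above can be sketched for a single item type as follows. The record layout (‘createdAt’, ‘closedAt’ and ‘labels’ keys), the explicit `now` argument and the function name `summarize_items` are assumptions for this example and are not confirmed by this page.

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize_items(items, labels, now, age_recent=90):
    """Summarize dicts with 'createdAt', 'closedAt' (or None) and 'labels'."""
    cutoff = now - timedelta(days=age_recent)
    opened = [item["createdAt"] for item in items]
    closed = [item["closedAt"] for item in items if item["closedAt"] is not None]
    still_open = [item for item in items if item["closedAt"] is None]

    def per_month(dates):
        # Count dates by 'YYYY-MM', in chronological order
        return dict(sorted(Counter(d.strftime("%Y-%m") for d in dates).items()))

    return {
        "age_recent": age_recent,
        "recent_open": sum(1 for d in opened if d >= cutoff),
        "recent_close": sum(1 for d in closed if d >= cutoff),
        "open_per_month": per_month(opened),
        "close_per_month": per_month(closed),
        "label_open": {
            label: sum(1 for item in still_open if label in item["labels"])
            for label in labels
        },
    }


# Made-up issue records, for illustration only
issues = [
    {"createdAt": datetime(2023, 12, 1), "closedAt": datetime(2024, 2, 1), "labels": ["bug"]},
    {"createdAt": datetime(2024, 3, 1), "closedAt": None, "labels": ["bug", "docs"]},
    {"createdAt": datetime(2024, 3, 10), "closedAt": None, "labels": []},
]
issue_stats = summarize_items(issues, ["bug", "docs"], now=datetime(2024, 4, 1))
```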
plot
- repo_stats.plot.author_time_plot(commit_stats, repo_owner, repo_name, cache_dir, window_avg=7)[source]
- Plot repository commit authors over time.
- Parameters:
- commit_stats (dict) – Dictionary including commit statistics. See git_metrics.GitMetrics.process_commits()
- repo_owner (str) – Owner of repository (for labels) 
- repo_name (str) – Name of repository (for labels and figure savename) 
- cache_dir (str) – Name of directory in which to cache figure 
- window_avg (int, default=7) – Number of months for rolling average of commit data. Enforced to be odd. 
 
- Returns:
- fig – The generated figure 
- Return type:
- plt.figure instance
 
- repo_stats.plot.citation_plot(cite_stats, repo_name, cache_dir, names=None)[source]
- Plot citations to referenced papers over time.
- Parameters:
- cite_stats (dict) – Dictionary including citation statistics. See citation_metrics.ADSCitations.aggregate_citations()
- repo_name (str) – Name of repository (for labels and figure savename) 
- cache_dir (str) – Name of directory in which to cache figure 
- names (list of str, optional) – Name of referenced papers (for plot legend) 
 
- Returns:
- fig – The generated figure 
- Return type:
- plt.figure instance
 
- repo_stats.plot.open_issue_pr_plot(issue_pr_stats, repo_name, cache_dir)[source]
- Plot a bar chart of a repository’s currently open issues and pull requests.
- Parameters:
- Returns:
- fig – The generated figure 
- Return type:
- plt.figure instance
 
- repo_stats.plot.issue_pr_time_plot(issue_pr_stats, repo_owner, repo_name, cache_dir, window_avg=7)[source]
- Plot the number of issues and pull requests opened and closed in a repository over time.
- Parameters:
- issue_pr_stats (list of dict) – Statistics for issues and pull requests (see git_metrics.GitMetrics.process_issues_prs)
- repo_owner (str) – Owner of repository (for labels) 
- repo_name (str) – Name of repository (for labels and figure savename) 
- cache_dir (str) – Name of directory in which to cache figure 
- window_avg (int, default=7) – Number of months for rolling average of issue and pull request data. Enforced to be odd. 
 
- Returns:
- fig – The generated figure 
- Return type:
- plt.figure instance
 
runner
user_stats
- class repo_stats.user_stats.StatsImage(template_image, font)[source]
- Class for updating a template image (e.g. to be displayed in a GitHub README) with repository and citation statistics.
- Parameters:
 - draw_text(coords, text, text_color=None, font=None, **kwargs)[source]
- Convenience wrapper for ‘PIL.ImageDraw.Draw’. 
 
utilities
- repo_stats.utilities.fill_missed_months(unique_output)[source]
- For an output of ‘np.unique(x, return_counts=True)’ where ‘x’ is a list of dates of the format ‘2024-01’, fill in months missing in this list and set their count to 0. 
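The month-filling step can be illustrated with plain lists standing in for the two arrays returned by ‘np.unique(x, return_counts=True)’. The function name `fill_months` and the choice to fill only between the first and last months present are assumptions for this sketch.

```python
def fill_months(months, counts):
    """Insert missing 'YYYY-MM' entries with a count of 0.

    `months` must be sorted, as the values returned by np.unique are.
    """
    lookup = dict(zip(months, counts))
    year, month = map(int, months[0].split("-"))
    last = tuple(map(int, months[-1].split("-")))

    filled_months, filled_counts = [], []
    while (year, month) <= last:
        key = f"{year:04d}-{month:02d}"
        filled_months.append(key)
        filled_counts.append(lookup.get(key, 0))
        # Advance one month, rolling over into the next year after December
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return filled_months, filled_counts


filled = fill_months(["2023-11", "2024-02"], [3, 1])
```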
- repo_stats.utilities.rolling_average(unaveraged, window)[source]
- Obtain a rolling average of ‘unaveraged’ data in a sliding window of index length ‘window’.
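A centered rolling average over a window of ‘window’ entries can be sketched as below. The handling of the edges (averaging over whatever part of the window fits) and the bump of an even window to the next odd value are assumptions; this page does not say how repo_stats treats either case. The name `centered_rolling_average` is hypothetical.

```python
def centered_rolling_average(values, window):
    """Average each entry with its neighbours in a centered window."""
    if window % 2 == 0:
        window += 1  # assumed way of enforcing an odd window
    half = window // 2

    averaged = []
    for i in range(len(values)):
        # Clip the window at both ends of the data
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        averaged.append(sum(values[lo:hi]) / (hi - lo))
    return averaged


smoothed = centered_rolling_average([1, 2, 3, 4, 5], 3)
```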