Aossie Scholar Extension — Google Summer of Code 2020

Aditya Bisoi
Aug 29, 2020


Aossie Scholar is an application that helps evaluate the performance of scholars and researchers. It calculates metrics such as the h-index, g-index, o-index, and e-index, to name a few, from factors such as how frequently a scholar publishes and the number of citations each paper has received.
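To make two of these metrics concrete, here is a minimal sketch using their standard definitions: the h-index is the largest h such that h papers have at least h citations each, and the g-index is the largest g such that the top g papers together have at least g² citations. The function names hIndex and gIndex are illustrative and are not taken from the project's code.

```javascript
// Illustrative sketch only; names and structure are assumptions,
// not the extension's actual implementation.

// h-index: largest h such that h papers have at least h citations each.
function hIndex(citations) {
  const sorted = [...citations].sort((a, b) => b - a); // most cited first
  let h = 0;
  while (h < sorted.length && sorted[h] >= h + 1) {
    h += 1;
  }
  return h;
}

// g-index: largest g such that the top g papers together have
// at least g * g citations (capped at the number of papers).
function gIndex(citations) {
  const sorted = [...citations].sort((a, b) => b - a);
  let total = 0;
  let g = 0;
  for (let i = 0; i < sorted.length; i += 1) {
    total += sorted[i];
    if (total >= (i + 1) * (i + 1)) {
      g = i + 1;
    }
  }
  return g;
}
```

For a scholar whose papers have 10, 8, 5, 4, and 3 citations, this sketch gives an h-index of 4 and a g-index of 5.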

The app was originally built with Django. It scraped raw data from the Google Scholar website and computed the metrics in the background. This approach had its limitations. Data scraping was slow, taking approximately 10–15 seconds for each new registration. Also, we were unclear about the website’s data scraping policy, even after carefully studying the ToS. So, an extension was needed for faster and easier access to the project, as well as for compliance with the data rules of the website.

Here is the link to the GitLab repository-

Work Summary (Weekly)

Phase 1

Week 1: Developed the popup and content scripts for the URL-specific behavior of the extension. Built a skeletal front-end template to demonstrate this behavior. As described in the proposal, the extension now shows the Register and Search panels on the Google Scholar website, and only the Search panel on any other website. The directory structure was initially refactored into separate extension and backend folders.
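The URL-specific behavior can be sketched roughly as follows. This is a simplified illustration, not the extension's actual code: the function name panelsForUrl, the panel identifiers, and the hostname check are all assumptions.

```javascript
// Hypothetical helper: decide which popup panels to show for a tab URL.
// The hostname prefix check is an assumption meant to cover
// scholar.google.com and its regional variants.
function panelsForUrl(url) {
  let host;
  try {
    host = new URL(url).hostname;
  } catch (err) {
    // Not a parseable URL (for example, an empty tab): Search only.
    return ['search'];
  }
  const onScholar = host.startsWith('scholar.google.');
  return onScholar ? ['register', 'search'] : ['search'];
}
```

In a real extension, the popup would obtain the active tab's URL through the browser's extension APIs and then show or hide the corresponding panels.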

Week 2: Introduced workflow automation using Gulp. The project now has a development server that watches for file changes and automatically compiles and compresses the files into a production folder. The extension’s directory structure has been refactored into separate folders. The documentation has also been updated with instructions for using the automated server during development.

Week 3: Developed the front-end of the popup according to the designs in the proposal. Added an autocomplete feature for the scholar’s name, i.e., the name is already filled in the registration form when the popup window opens.

Week 4: Added data scraping to extract raw data from the Google Scholar website. The extension extracts the scholar’s name, publication titles, years of publication, number of citations, profile picture URL, website URL, and workplace name.

Phase 2

Week 5: Developed a background script that opens a new tab for viewing the profile page after data scraping. Developed methods in the background script for calculating the different metrics from raw data. Discovered that using a headless scraper to scrape co-author data is too costly.

Week 6: Moved the metric calculation from the background script to an independent script used only by the profile page. Developed the entire front-end of the profile page from scratch. Added methods to bind the calculated metrics and other data to the profile page for display. Discovered Chart.js, a popular open-source visualization library. The extension now opens a profile page and displays calculated metrics and graphs, along with other relevant scholar information.

Week 7: Refactored the previously used REST API format for ease of use. Using Django REST framework, changed the URLs to handle POST requests during new scholar registration. Added functionality to the dummy search form. The form now sends the search term to the background script, which will send it as a GET request to the database.

Week 8: Completed the search functionality by adding the search function. The background script now sends a GET request to the database to check whether the searched scholar is already registered. If it finds a result, the data is fetched from the database and visualized in the profile view. The previous Django backend had a lot of redundant files that the extension no longer needs. So, I refactored the entire backend, removing unnecessary files, folders, and libraries, and renaming the remaining ones to more understandable, developer-friendly titles.

Phase 3

Week 9: Added some initial documentation for setting up the project locally. Since the directory structure was completely reorganized in the previous phase, updated the file paths in the local Gulpfile. Replaced the Scholar logo with a new, cleaner one after review by the mentors. Since the publication data and metrics on a Google Scholar page keep changing, I needed a way to update the values of an already registered scholar. So, I refactored the code to handle such cases. Now, the code first checks if the scholar is already registered. If not, it registers the scholar. If the scholar is already registered, it updates the values.
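The register-or-update check can be illustrated with a small sketch. Here an in-memory Map stands in for the database, and the function name registerOrUpdate and the id field are hypothetical. In the actual project this logic goes through the Django backend, whose details are not shown in this post.

```javascript
// Hypothetical sketch: a Map keyed by scholar id stands in for the database.
function registerOrUpdate(store, scholar) {
  const existing = store.get(scholar.id);
  if (existing === undefined) {
    // Scholar not seen before: register a new record.
    store.set(scholar.id, { ...scholar });
    return 'registered';
  }
  // Scholar already registered: overwrite stored values with fresh ones.
  store.set(scholar.id, { ...existing, ...scholar });
  return 'updated';
}
```

Calling it twice with the same id registers the scholar the first time and refreshes the stored values the second time.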

Week 10: Dropped the idea of creating a Slack app after discussion with the mentors, because it was not very relevant. Instead, I decided to add Continuous Integration to the project, which would later help a lot in keeping the code secure and maintaining its quality. First, I added code linting and formatting using Prettier and ESLint. Both are outstanding at maintaining code quality, so I decided to use them together: ESLint as the linter and Prettier as the code formatter. I had to adjust the configuration to make them work with each other without conflicts. I used Husky to set up a pre-commit Git hook. It runs linting checks before new code is committed to the repository. This helps prevent unnecessary build failures in the pipeline after pushing the code. The developer can focus on the code rather than on its format.

Week 11: Added tests to the project. I used Jest since it is one of the newest and most popular JavaScript testing libraries. I used Puppeteer for end-to-end (e2e) testing. Adjusting the configuration so that the two work with each other also took a significant amount of time. I created a Dockerfile for the project, which installs Google Chrome and xvfb in the test environment. Finally, I added the pipeline to automatically run some tests when code is pushed to the repository. The pipeline has three stages: format, build, and test.

Week 12: In the final week, I added the final feature, Scholar Starring. This feature enables users to star (bookmark) a scholar. Bookmarked scholars can be found by pressing the ‘Star’ button on the extension popup, which opens a new page showing the list of starred scholars. Clicking on a scholar’s name redirects the user to that scholar’s profile page. I also decided to add a ‘Search Results’ page, shown when a search returns multiple results. Then came the most important component of a software project: the documentation. I added proper guidelines for using the project, as well as steps for installing a local development version of the project, building, testing, and formatting it. Finally, I deployed the project on the Chrome Web Store.

During my GSoC period, I addressed 9 major issues (bug fixes and, mostly, new features), with 12 Merge Requests and 200+ commits.

All my contributions can be found here

The issues addressed in the period can be found here

The MRs made (MERGED) can be found here

The extension itself is available on the Chrome Web Store here

Future Scope

The AOSSIE Scholar extension has been developed from scratch. Although it has all the intended functionality, there is still some scope for future development:

  1. Re-develop the front-end using a JS framework, such as ReactJS, AngularJS, or VueJS
  2. Add enhancements such as browser notifications
  3. Enable cross-browser support for the current Chrome-based extension

Finally, thank you, Manikaran Singh, Bruno Woltzenlogel Paleo, and Siwani Agrawal for mentoring me throughout my GSoC journey!
