Johns Hopkins NLP Sentiment Analysis Java to Django Application


Course Assessment using Data Exploration of Text (CADET)


Mock Up Github Repo

##What is CADET?

The Course Assessment using Data Exploration of Text (CADET) is a tool that uses natural language processing to analyze student course feedback. The automated tool can process large datasets and categorize comments by sentiment and topic, as well as by course and instructor.

The intention of the original CADET program is to provide longitudinal analysis and pedagogical direction from the collected course evaluations.

The current desktop CADET program has a user interface that cannot run in web browsers. The desktop UI is also missing validation and error messages, so porting it to the web as-is would cause behavioral and interaction issues for end users.

##Specifications for the Dashboard Layer of the CADET Program

This specification outlines a proposal for a new web user interface to replace the desktop user interface. The new web interface needs to have the same capabilities as the desktop system: uploading files, selecting natural language processing options, and switching between the topics-comments view and the instructors-comments view.

##My Role in the Project

Core contributor for porting a Java app into a Django implementation for the Hopkins Course Assessment using Data Exploration of Text (CADET) program.

Implemented the front-end layer with AJAX calls, using the Chart.js library to display parsed data from the back-end layer.
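To give a sense of what the back-end hands to Chart.js, here is a minimal sketch of shaping parsed comments into the `labels`/`datasets` structure that Chart.js bar charts expect. The function name `build_chart_data`, the comment fields (`instructor`, `topic`, `sentiment`), and the sample rows are assumptions made for illustration; they are not taken from the CADET code base.

```python
import json
from collections import Counter

# The three sentiments shown in the distribution graphs.
SENTIMENTS = ("positive", "negative", "neutral")


def build_chart_data(comments, group_key):
    """Count comments per group and sentiment, shaped like a Chart.js `data` object.

    `comments` is a list of dicts; `group_key` is "instructor" or "topic".
    Both the dict layout and the key names are hypothetical.
    """
    labels = sorted({c[group_key] for c in comments})
    counts = Counter((c[group_key], c["sentiment"]) for c in comments)
    return {
        "labels": labels,
        # One dataset per sentiment, with one count per label.
        "datasets": [
            {"label": s, "data": [counts[(g, s)] for g in labels]}
            for s in SENTIMENTS
        ],
    }


# Hypothetical parsed comments, standing in for the NLP output.
sample = [
    {"instructor": "Instructor A", "topic": "homework", "sentiment": "positive"},
    {"instructor": "Instructor A", "topic": "exams", "sentiment": "negative"},
    {"instructor": "Instructor B", "topic": "homework", "sentiment": "positive"},
    {"instructor": "Instructor B", "topic": "homework", "sentiment": "neutral"},
]

# JSON body that an AJAX endpoint could return for the instructor panel.
payload = json.dumps(build_chart_data(sample, "instructor"))
```

The same function serves both panels: passing `"topic"` instead of `"instructor"` produces the data for the topics distribution graph.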

##Problem Description

In order to port the desktop system to a web system, CADET needs to address two major missing features. The first is structural, while the second is behavioral. The current desktop system uses the Python PyQt5 library for its desktop-based GUI. This PyQt5 setup cannot run in browsers: PyQt5 has no web library, and the project has no Django templates. Django is a modern web framework written in Python, used mostly for agile web development, and it expects view templates in its directory structure.

In addition to the missing templates, the desktop system does not present information with the behavior most web users would expect. For example, the CADET desktop GUI does not display error messages. This missing behavior makes it hard for users to understand why the program either crashes or fails to hide labels that are not in use.

##Requirements

The current dashboard relies heavily on PyQt5 for graphical output, together with the NumPy and Matplotlib Python libraries. The web dashboard would still use refactored and rewritten code related to NumPy and Matplotlib, but needs to replace PyQt5 with a complete front-end system in the dashboard: HTML templates, plus JavaScript to handle events. The PyQt5 library will not be carried over to the web.

##Functional Requirements:

Create a dashboard UI that works with current data visualization libraries within the Django framework.

Follow the official JHU branding guidelines for color choice, typography, and graphic treatment. However, the web application should refrain from using any specific logos until the university approves the application.

Adhere to the latest web user-experience best practices.

Display options to easily navigate between the upload view, options view, and distributions view. The distributions view incorporates two panels. The first panel displays the instructor-to-comments distribution graph. The second panel displays the topics-to-comments distribution graph.

Display three different sentiments (positive, negative, and neutral) in the distribution graphs.

The options view has these inputs: number of topics, words per topic, and number of iterations.

Display all comments tied to a word under the topic-comments cluster when the user selects that word.

Create Python templates for Django.

##Quality Requirements

Create an intuitive, clean user interface based on common web standards. There is no permanent set of criteria to define “clean” or “intuitive” since design trends shift from one year to the next. However, the site webchecklist.com provides a decent guide for best practices, usability, code quality, and accessibility. The current dashboard will focus on the four criteria listed above. Other criteria that focus on mobile devices, web analytics, search engine optimization, and social media are out of the scope of this specification.
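The options view inputs named in the functional requirements (number of topics, words per topic, number of iterations) are a natural place for the validation and error messages that the desktop UI lacks. The following is a minimal sketch of that validation, not the project's actual code: the field names, the `TopicModelOptions` class, the `parse_options` helper, and the message wording are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicModelOptions:
    """Validated options view inputs (hypothetical field names)."""
    num_topics: int
    words_per_topic: int
    num_iterations: int


def parse_options(form):
    """Validate raw form values and return `(options, errors)`.

    `form` maps field names to raw strings, as submitted form data would.
    On failure, `options` is None and `errors` maps each bad field to a
    message that the template can show next to the input.
    """
    values, errors = {}, {}
    for field in ("num_topics", "words_per_topic", "num_iterations"):
        raw = str(form.get(field, "")).strip()
        if not raw:
            errors[field] = "This field is required."
        elif not raw.isdecimal() or int(raw) < 1:
            errors[field] = "Enter a whole number of 1 or more."
        else:
            values[field] = int(raw)
    if errors:
        return None, errors
    return TopicModelOptions(**values), errors


# Example: one invalid value, one non-numeric value, one missing field.
options, errors = parse_options({"num_topics": "0", "words_per_topic": "abc"})
```

Returning the errors as a dictionary keyed by field keeps the messages next to the inputs that caused them, so the user sees the problem where it occurred.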

##Use Case

The department chair and the instructor will have the exact same use case.

Primary Actors: The department chair needs to upload a file and view the distribution of comments by topic or by instructor.

Precondition:

The department chair must have already logged into the CADET system.

The department chair has already uploaded a valid file, and it has been processed through the Python natural language library.

The RESTful API responds to valid requests for data in the database.

The department chair is using a Chrome or Firefox browser.

The data is already processed with the natural language processing API and is accessible in the RESTful API through URL requests from the browser.

The user has already finished inputting the topic model options for the number of topics.

Postcondition: The dashboard shows a graphical representation of the processed data. The user has the ability to switch between different views in the dashboard. If the user clicks on the topics distribution view, the page should display comments across a chosen topic. If the user clicks on the instructor distribution view, the page should load and display the comments directed at the instructor, by sentiment.

Variations: The department chair’s primary objective might not be to view instructors and topics, but to have a more granular view, such as sorting topics or professors by specific semester. This is out of the scope of the current iteration of this dashboard.

Variation Handling: A disclaimer or user-friendly text states the purpose of the current functionality of the dashboard.
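The view switching described in the postcondition comes down to two lookups over the processed comments: one by topic or instructor, and one by a selected word. Here is a rough sketch under stated assumptions. The comment layout and both helper names are hypothetical, and treating "tied to a word" as "the comment text contains the word" is a simplification that a real topic model may not share.

```python
import re


def comments_by_sentiment(comments, field, value):
    """Group the text of comments where `comments[field] == value` by sentiment.

    Use field="topic" for the topics view and field="instructor" for the
    instructor view.
    """
    grouped = {"positive": [], "negative": [], "neutral": []}
    for c in comments:
        if c[field] == value:
            grouped[c["sentiment"]].append(c["text"])
    return grouped


def comments_containing(comments, word):
    """Return the text of comments containing `word` as a whole word.

    Matching is case-insensitive. Containment is an assumed stand-in for
    however the topic model links words to comments.
    """
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [c["text"] for c in comments if pattern.search(c["text"])]


# Hypothetical processed comments.
comments = [
    {"instructor": "Instructor A", "topic": "homework",
     "sentiment": "positive", "text": "The homework was useful."},
    {"instructor": "Instructor A", "topic": "exams",
     "sentiment": "negative", "text": "Exams were too long."},
    {"instructor": "Instructor B", "topic": "homework",
     "sentiment": "neutral", "text": "Homework load was average."},
]

instructor_view = comments_by_sentiment(comments, "instructor", "Instructor A")
word_view = comments_containing(comments, "homework")
```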
