CrowdGrader: CrowdGrader can now check submission similarity

One problem with crowd-grading is that no one grades all the submissions, so it is difficult to detect when students submit the same solution.

To help detect similar submissions, we have implemented a similarity checker in CrowdGrader. And not just any similarity checker: the most full-featured similarity checker we could wish for.

The feature is accessible from an assignment page by selecting Submissions > Check Similarity.
Note that it is still in beta; please report any problems.

Input formats

The CrowdGrader similarity checker can process:

Text typed directly in CrowdGrader.
Attached Word (docx, not doc), PDF, HTML, and RTF documents.
Attached source files in any programming language (it deals correctly with comments in C, C++, Java, and Python).
zip, tar, and tgz archives. Within an archive, you can specify which subset of files should be processed, so that the similarity results are meaningful and not drowned out by a myriad of standard files.
Versions of the above files compressed with gzip.

CrowdGrader also accepts any nesting of the above; for instance, multiple Word files included in a single zip file are fine.
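To give a feel for what handling nested input involves, here is a minimal Python sketch that walks through zip, tar, tgz, and gzip containers recursively and keeps only the files matching a filter. It is not CrowdGrader's actual code: the function name `iter_files`, the `patterns` parameter, and the choice to dispatch on file extension are all assumptions made for illustration.

```python
import fnmatch
import gzip
import io
import tarfile
import zipfile


def iter_files(name, data, patterns=("*",)):
    """Yield (path, bytes) for each leaf file in possibly nested archives.

    `patterns` is a hypothetical filter: a leaf file is kept only if its
    base name matches at least one of the glob patterns.
    """
    lower = name.lower()
    if lower.endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield from iter_files(
                        name + "/" + info.filename, zf.read(info), patterns)
    elif lower.endswith((".tar", ".tgz", ".tar.gz")):
        # Mode "r:*" lets tarfile detect gzip compression transparently.
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tf:
            for member in tf.getmembers():
                if member.isfile():
                    yield from iter_files(
                        name + "/" + member.name,
                        tf.extractfile(member).read(), patterns)
    elif lower.endswith(".gz"):
        # A plain gzip file: strip the suffix and look at what is inside.
        yield from iter_files(name[:-3], gzip.decompress(data), patterns)
    else:
        base = name.rsplit("/", 1)[-1]
        if any(fnmatch.fnmatch(base, p) for p in patterns):
            yield name, data
```

Calling `iter_files("submission.zip", raw_bytes, patterns=("*.py",))` would yield only the Python sources, however deeply they are nested. Dispatching on the extension rather than on the file contents is deliberate in this sketch: a docx file is itself a zip container, and it should be treated as a document rather than unpacked.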

Comparison output

CrowdGrader distinguishes between text that is:

Unchanged: equal in the two submissions
Renamed: uniformly renamed (for instance, when a variable is renamed)
Different
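
As a rough illustration of the three categories, the Python sketch below labels pairs of tokens from two snippets. It is a simplification and not CrowdGrader's algorithm: it assumes the two token streams are already aligned position by position, and the names `classify`, `TOKEN`, and `IDENT` are made up for this example. A pair counts as renamed only if the renaming is uniform, that is, the same name on one side always corresponds to the same name on the other.

```python
import re

# Identifiers, integer literals, or any other single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENT = re.compile(r"[A-Za-z_]\w*\Z")


def classify(a_text, b_text):
    """Label position-aligned token pairs (assumed alignment, see above)."""
    forward, backward = {}, {}  # rename map and its inverse
    labels = []
    for a, b in zip(TOKEN.findall(a_text), TOKEN.findall(b_text)):
        if a == b:
            labels.append((a, b, "unchanged"))
        elif (IDENT.match(a) and IDENT.match(b)
              and forward.get(a, b) == b
              and backward.get(b, a) == a):
            # Consistent with every earlier renaming: record and accept it.
            forward[a] = b
            backward[b] = a
            labels.append((a, b, "renamed"))
        else:
            labels.append((a, b, "different"))
    return labels
```

For example, comparing `total = total + x` with `acc = acc + 1` labels both `total`/`acc` pairs as renamed, `=` and `+` as unchanged, and `x`/`1` as different. A real checker would also need to align the two texts first and to treat language keywords differently from variable names, which this sketch does not attempt.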

CrowdGrader also clusters similar submissions for you, according to a threshold of your choice.
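
One simple way to picture threshold-based clustering is sketched below in Python. The similarity measure (`difflib.SequenceMatcher.ratio`) and the single-linkage grouping are assumptions chosen for brevity, not a description of what CrowdGrader does internally; `cluster` is a hypothetical name.

```python
from difflib import SequenceMatcher
from itertools import combinations


def cluster(submissions, threshold):
    """Group submission ids whose pairwise similarity reaches `threshold`.

    Single-linkage: two submissions end up in the same cluster if a chain
    of sufficiently similar pairs connects them.
    """
    parent = {sid: sid for sid in submissions}

    def find(sid):
        # Union-find lookup with path halving.
        while parent[sid] != sid:
            parent[sid] = parent[parent[sid]]
            sid = parent[sid]
        return sid

    for a, b in combinations(submissions, 2):
        ratio = SequenceMatcher(None, submissions[a], submissions[b]).ratio()
        if ratio >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for sid in submissions:
        groups.setdefault(find(sid), []).append(sid)
    return sorted(sorted(group) for group in groups.values())
```

Here `submissions` maps a submission id to its text. Raising the threshold splits clusters apart and lowering it merges them, so calling `cluster` again with a different value is all it takes to explore the groupings. Comparing every pair is quadratic in the number of submissions, which is acceptable for a class-sized assignment in a sketch like this.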

Perhaps the best way to see this is to look at a couple of screenshots.

Submissions are clustered according to their similarity.
You can dynamically vary the similarity threshold and explore the resulting clusters.

You can examine similar submissions side-by-side.
Identical content is highlighted in blue; content that has been renamed is highlighted in green.

Sunday, September 20, 2015
