12/7 How many people cheated in 162 and got caught?
\_ A little worried about our 162 grade, eh?
\_ Man, people who get caught cheating just haven't gotten the art
of it down. Should've been practicing and perfecting it in high
school (not that you needed to cheat in HS)
\_ I'm not sure how one would beat a histogram-cheat-detection
scheme. Anyone have any ideas? Other than rearranging
lots of code. And if you do that, why not just do the
project? --PeterM
\_ how does a histogram-cheat-detection scheme work?
\_ I'm not sure, exactly. This is how I imagine it works:
do "wordcount" on all the symbols in the program.
Then you sort the symbols in order of number of
occurrences. Plot a graph of this. Programs
which have just had one symbol substituted for
another will have identical graphs. --PeterM
\_ Hmm, that's easy to break. Just introduce a lot
of redundant local variables with different names
in many routines. Also, rename your local vars
"i" in all routines to "n" for some routine and
"x" for other routines.
\_ Uh huh. What if I look at keywords too?
A histogram of those. Now you have to
rearrange all the control structures too.
This is beginning to sound like a lot of
work: you'd have to do all this rearranging
and substituting and test the result.
--PeterM
\_ just make sure you send the code through
a preprocessor first, to make sure the
keywords aren't being substituted with
#defines
\_ This only breaks one histogram method. There
are probably multiple ones used.
\_ Call graph topology comparison
\_ I suspect that any kind of polynomial time
code comparison program could be defeated
without much trouble. I say this because of various
results in complexity theory about determining
equivalence. For example, determining equivalence of
Turning Machines or Context Free Grammers is
undecidable. Determining equivalence of regular
expressions (or finite automata) is PSPACE-complete.
(Recall that PSPACE is a harder class than NP).
Since programs running on regular computers
(not Turing Machines) are actually finite automata,
if you come up with an algorithm to efficiently compare
programs then you can solve a hell of a lot of other
more important problems. In practice, you can probably
defeat most program comparison tools by writing an
obfuscator program. One possible obfuscation
technique would be to merge and expand functions in a
random manner, and then randomly rename all symbols
including the function names. Of course, cheating
is unethical, but the comparison problem is an
interesting thing to think about. -emin
\_ as mentioned below, one would probably focus
on heuristics that work on typical undergrad-
LOT of results out there, and even if you CAN cheat
a finite-dimensional analysis of infinite-dimensional
spectra, it's extremely difficult to do so if you
don't know which dimensions they are looking at.
While MOSS may not be all that great, it's
presumably non-trivial. Listen to peterm above, he's
basically right. Oh, and another thing -- the class
of all poly-time code comparisons is way too huge and
it should be relatively trivial to prove that not
only is it not "easy" to defeat a random sample of N
poly-time tests if you don't know which ones they are
using, but it's probably complete for BPEXP or
something equally intractable. -alexf
generated, human-readable code. further, it is
likely one would need to convert to an abstract
normal-form that is easily compared to check
the N-way cheating problem efficiently.
\_ What 172 fails to mention is the practical
inapplicability of most of these results. There are a
LOT of heuristics out there, and even if you CAN
cheat a finite-dimensional analysis of
infinite-dimensional spectra, it's extremely
difficult to do so if you don't know which dimensions
they are looking at. While MOSS may not be all that
great, it's presumably non-trivial. Listen to peterm
above, he's basically right. Oh, and another thing --
the class of all poly-time code comparisons is way
too huge and it should be relatively trivial to prove
that not only is it not "easy" to defeat a random
sample of N poly-time tests if you don't know which
ones they are using, but it's probably complete for
BPEXP or something equally intractable. -alexf
\_ What emin means to say is, "The cheat program is a
crock of shit. It's just a slightly improved
version of 'diff' so don't let it bug-a-boo you."
They've been talking about their mythical anti-cheat
program for years but I've turned in tons of out
right stolen code with no changes and got good
grades.
\_ you are a hollow man. You also are lucky:
the methods described above are capable of
catching THAT. Your empty mind will betray
you when you meet real challenges, and
your luck will fail.
\_ Luck? I grabeed shit off the printer and out
the trash, retyped, turned in. A. Did it
plenty of times. I could do the work, I just
didn't feel like wasting my time on it. I
went drinking and got laid. What were you
doing in the lab all night?
\_ Uh huh, and if you'd been caught, you'd
be expelled and getting your
degree from DeVry. Luck. And lazy-ass
readers/Profs.
\_ Unfortunately, the way the
work structure operates, you
can cheat your way through
the cs program at Berkeley
and yet get by in the real
world with at least a $50K
salary or even better. I opted
to get out rather than live
the lie. Some people don't
have those compunctions, but
who knows? Maybe they'll get
what's coming to them, and maybe
they won't.
\_ People have gotten caught cheating
but they didnt get expelled. it's
usually a big empty threat.
\_ depends on what they're caught
cheating on, and how many past
offenses they've had.
\_ Yes, of course the worst
offenders get booted. I did
not say nobody gets expelled
but usually it's a big
empty threat. MOST cheaters
do not even get an F for
it.
the course. This is far
different from being exiled
to DeVry for being caught.
\_ and would this obfuscator be able to produce
human-readable code? if a person spent even a
small amount of time looking at it, wouldn't it be
pretty apparent that the variable and function
names make no sense? (I guess the obfuscator
could use a thesaurus and randomly pick synonyms,
but I doubt that'd produce reasonably
intelligible results either.)
\_ This requires that someone actually look at
the code. Lazy readers and profs permitted the
cheater above to get by. No cheat-detection
method has a chance of working if it is not
applied, the common case, I imagine. -PeterM
\_ obviously, but I meant that machine-obfuscated
code should be pretty easy to recognize, even
if a reader only spends a minute looking at
it. Is a minute too much to ask for?
(rhetorical question; I know the answer is
"yes".) |