HP3000-L Archives

January 2006, Week 4

HP3000-L@RAVEN.UTC.EDU

Date: Wed, 25 Jan 2006 18:36:50 -0800
Speaking from past experience: a site in Minnesota where I was contracting on their HP tried that sort of thing on their mail servers and blog sites.  The upshot was that they nearly "burned up" their server trying to "edit" out anything, and after 3 separate crashes they simply gave up.  It is very expensive and requires tremendous server capacity to do correctly.  But if your company has the time and money to burn, go for it.  Lots of luck.
   
  Judy
  

Dave Oksner <[log in to unmask]> wrote:
  And how are you going to tell whether or not the word is contained inside
another word, rendering it "out of context?" Or what about if the word is
hyphenated across lines? Or they use any number of spammer tricks to make
the word a "non-word" that people will recognize as the intended word?
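[Editor's sketch] The first of these problems, a listed word turning up inside a longer, harmless word, can be illustrated in a few lines of Python. This is only a minimal sketch, not anything proposed in the thread: the function names are hypothetical, and "heck" is a mild placeholder standing in for a real list entry. It compares plain substring replacement with matching on regex word boundaries.

```python
import re

REPLACEMENT = "{text deleted by moderator}"

def censor_substring(text, words):
    """Naive approach: replace a listed word wherever it appears,
    even when it is only a fragment of a longer word."""
    for word in words:
        text = re.sub(re.escape(word), REPLACEMENT, text, flags=re.IGNORECASE)
    return text

def censor_whole_words(text, words):
    """Replace a listed word only when it stands alone; \\b anchors the
    match to word boundaries, so fragments of longer words are skipped."""
    if not words:
        return text
    alternatives = "|".join(re.escape(word) for word in words)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(REPLACEMENT, text)

if __name__ == "__main__":
    sample = "The spell checker said heck"
    print(censor_substring(sample, ["heck"]))    # also mangles "checker"
    print(censor_whole_words(sample, ["heck"]))  # leaves "checker" alone
```

Word boundaries address only that first objection. A word hyphenated across lines, or deliberately respelled with punctuation or look-alike characters, still slips past this pattern, and nothing here addresses context.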

And how are you going to be able to tell the context of something? Is
the word "breast" used with regards to cancer research or porn? Is the
word "porn" used with regards to profanity filters or as a keyword? :-)

I'm sure that this is why, whenever possible, forums and BBSs have a 
disclaimer that all content is owned by the poster and that the site is not 
responsible for it. If they're not legally allowed to do that, it's 
unlikely to be allowed at all.

Dave

On Wed, Jan 25, 2006 at 05:51:30PM -0500, John K. wrote:
> I've spent hours searching the web for a solution, but so far I haven't 
> found one. I have found many papers on how you can't publish a list of bad 
> words without getting in trouble for publishing bad words in the UK (not 
> helpful and of questionable accuracy) and how any list of bad words has to 
> include at a minimum a compendium of bad words from the 22 most frequently 
> spoken languages (but doesn't list the languages or the words). Thus, I'm 
> asking the HP3000-L for input.
> 
> How did I get into this? Since I'm still unemployed, I'm doing a little 
> bit of photography and web site work in addition to job hunting.
> 
> In a web site project I'm working on, I need to "censor" posts that users 
> make. Specifically, I need to remove "obscene, vulgar, offensive, abusive, 
> hateful, harassing, profane, sexually oriented, and threatening" words, 
> replacing each occurrence with the very long phrase "{text deleted by 
> moderator}".
> 
> Obviously, my first question was "Do you have a list of the words you want 
> removed?" Of course the answer was "no." (LOL, what was I thinking, asking 
> such a question!)
> 
> Which (of course) led to my second question "but you will provide the list, 
> correct?" Of course the answer to that question was also "no."
> 
> My next question was "Do you have a list of words that people have 
> complained about?" Turned out they did, but that only served to point out 
> another problem - many of the "words" are only offensive when used in a 
> certain context.
> 
> Example.
> "Next, Bob asked Dick about the 69 exception reports. Dick
> replied that all were related to a robotics problem - a
> hydraulic line feeding a robotic arm blew, shutting down
> production."
> became:
> "Next, Bob asked {text deleted by moderator} about the
> {text deleted by moderator} exception reports. {text deleted
> by moderator} replied that all were related to a robotics
> problem - a hydraulic line feeding a robotic arm {text
> deleted by moderator}, shutting down production."
> 
> Okay, so maybe it is easier to get a chuckle out of the "censored" 
> text. Still, the context problem remains.
> 
> Thus, I'm looking for:
> -- a list of bad words
> -- some context sensitive software (that runs on Linux) that:
>    -- ALWAYS deletes words on list 1
>    -- will ONLY delete words on list 2 if they are in an
>       offensive context.
> -- any info on similar "projects" (and their solutions) you are
> aware of.
> 
> Thanks in advance!
> 
> John
> *** When replying to this message, please do not delete these ***
> *** signature lines. Otakon Katsucon HP3000-L @classiccmp.org ***
> *** DigitalCosplay.com JohnKorbPhoto.com JohnPKorb.com ***
> 
> * To join/leave the list, search archives, change list settings, *
> * etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
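
[Editor's sketch] The two-list requirement in John's message can be sketched in Python as below. Everything here is an assumption made for illustration: the function names, the placeholder word lists, and above all the "context" rule. A list 2 word is treated as offensive only when a list 1 word appears in the same sentence, which is a crude stand-in for real context analysis and not a solution anyone in the thread proposed.

```python
import re

REPLACEMENT = "{text deleted by moderator}"

def _word_pattern(words):
    """Compile a case-insensitive whole-word pattern for the given list."""
    if not words:
        return re.compile(r"(?!)")  # empty list: a pattern that never matches
    alternatives = "|".join(re.escape(word) for word in words)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

def censor(text, always, contextual):
    """Delete `always` (list 1) words everywhere; delete `contextual`
    (list 2) words only in sentences that also contain a list 1 word.
    The same-sentence rule is a hypothetical heuristic."""
    always_re = _word_pattern(always)
    contextual_re = _word_pattern(contextual)
    result = []
    # Split after sentence-ending punctuation; whitespace between
    # sentences is normalized to a single space when rejoining.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if always_re.search(sentence):
            sentence = contextual_re.sub(REPLACEMENT, sentence)
        result.append(always_re.sub(REPLACEMENT, sentence))
    return " ".join(result)

if __name__ == "__main__":
    # "darn" stands in for a list 1 entry, "blew" for a list 2 entry.
    print(censor("A hydraulic line blew. That darn arm blew it.",
                 always=["darn"], contextual=["blew"]))
```

With these placeholder lists the first sentence is left untouched, while both words are replaced in the second. The heuristic will still misfire in both directions, which is the problem the replies in this thread describe.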

-- 
+-------David Oksner-----http://www.case.net/--------+
|Witten's Law: |
| Whenever you cut your fingernails, you will find a|
|need for them an hour later. |
[log in to unmask]
