;Some assembly required.
;More instructions are in the leading script comments.
This is a two-part script that checks links posted in IRC channels and provides a warning for suspicious links.
The first part checks the domain for strings that match or nearly match phrases that you specify.
The second part checks the page against Google's phishing and malware blacklists. This lookup can be disabled via an alias near the top of the script (instructions in the comments).
/** EXAMPLE
(10:18:23pm) secure.login.fasebook.com
(10:18:23pm) <@Curiosity> [CAUTION]: {match:'facebook',domain:'fasebook.com',trusted:'unknown'}
(10:18:48pm) ianfette.org
(10:18:50pm) <@Curiosity> [WARNING]: This site may contain malware. You can read more about this warning at: http://safebrowsing.clients.google.com/safebrowsing/diagnostic?site=http%3A%2F%2Fianfette%2Eorg%2F [Advisory provided by Google's Safe Browsing Lookup API]
*/
Both parts require Mozilla's public suffix list to be saved to a text file in the save directory. To save the list, type the following into the editbox (without quotes):
" //url -an http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1 | var %file $_phishingDir(mozilla_effective_tld_names.txt) | write %file | run %file | run $nofile(%file) "
The public suffix list is used to correctly parse the base domain name.
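To make that parsing rule concrete, here is an illustrative sketch in Python (not part of the mIRC script) of how a suffix list yields the base domain. The tiny SUFFIXES set is a stand-in for Mozilla's full list, which also contains wildcard (*) and exception (!) rules that the script handles.

```python
# Illustrative sketch only; the real list has thousands of rules,
# including wildcard and exception entries.
SUFFIXES = {"com", "org", "uk", "co.uk"}  # stand-in for Mozilla's list

def parse_domain(host):
    """Return one label plus the longest matching public suffix, or None."""
    labels = host.lower().split(".")
    # i = 0 tries the whole host, so the longest candidate suffix wins first
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in SUFFIXES:
            if i == 0:
                return None  # the host itself is a public suffix
            return ".".join(labels[i - 1:])
    return None

print(parse_domain("foo.example.co.uk"))         # example.co.uk
print(parse_domain("secure.login.fasebook.com")) # fasebook.com
```

The script caches these results in a hash table because some TLDs carry many rules and are expensive to parse repeatedly.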
SSL is required to connect to Google's API. http://www.mirc.com/ssl.html
//echo -a $sslready
You will need to generate an API key to use Google's service.
Sign up here: https://developers.google.com/safe-browsing/key_signup
Add your key to alias 'api_client_key'.
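For reference, the HTTP request the script assembles (see its SOCKOPEN handler) can be sketched in Python. The key and client values below are placeholders, and the v3.1 Lookup protocol shown here has since been superseded by newer Safe Browsing APIs.

```python
from urllib.parse import urlencode

def lookup_request(url, key, client="Sojourner", appver="0.01"):
    # Build the GET request line for the Safe Browsing Lookup API v3.1,
    # mirroring the form the script encodes: client, key, appver, pver, url.
    params = {"client": client, "key": key, "appver": appver,
              "pver": "3.1", "url": url}
    return "GET /safebrowsing/api/lookup?" + urlencode(params) + " HTTP/1.1"

req = lookup_request("http://ianfette.org/", "YOUR_API_KEY")
# The server answers 204 (not listed) or 200 with a body of
# 'phishing', 'malware', or 'phishing,malware'.
```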
You should whitelist trusted domains to reduce API lookups and to keep warnings from being displayed for legitimate matching sites. For instance, if you're checking for 'runescape' with an edit distance range of 0-2, you would want to whitelist 'runescape.com'. 'runescape.wikia.com' would also return a match, but wikia.com can be considered a trusted domain, so it should be whitelisted as well. 'services-runescaepee.com' would return a match too, but should obviously NOT be whitelisted.
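That matching-plus-whitelist flow can be sketched in Python (for illustration only; the watch list and trust table below are made-up stand-ins for your own settings):

```python
import re

def levenshtein(a, b):
    # classic dynamic-programming edit distance (insert/delete/replace cost 1)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(cur[j - 1] + 1, prev[j] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

WATCH = [("runescape", range(0, 3)), ("facebook", range(0, 3))]  # distance 0-2
TRUSTED = {"runescape.com": True, "wikia.com": True}             # whitelist

def check(domain):
    # split the domain into alphanumeric tokens, as the script's regex does
    for token in re.findall(r"[a-z\d]+", domain.lower()):
        for word, allowed in WATCH:
            if levenshtein(token, word) in allowed:
                return word, TRUSTED.get(domain)  # match + trust status
    return None

print(check("secure.login.fasebook.com"))   # -> ('facebook', None)
print(check("services-runescaepee.com"))    # -> ('runescape', None)
```

A result with trust status None corresponds to the 'unknown' case in the script's [CAUTION] output.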
To add or remove sites from the black/white lists, use /trust [-rbw] <example.com> (remove/black/white).
/*
* Written by Yawhatnever (Travis) irc.swiftirc.net #mSL
* Other sources are listed within the code.
* Free to use and modify as long as this comment is included with any substantial part of the script.
*/
/***********************************************
* An API key is required to use Google's service.
* Sign up here: https://developers.google.com/safe-browsing/key_signup
* Add your key to alias 'api_client_key'.
************************************************
* Requires SSL.
//echo -a $sslready
* http://www.mirc.com/ssl.html
************************************************
* mozilla_effective_tld_names.txt should be updated occasionally in order to remain accurate.
* Type the following:
//url -an http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1 | var %file $_phishingDir(mozilla_effective_tld_names.txt) | write %file | run %file | run $nofile(%file)
*/
;***************************** BEGIN CUSTOMIZED SETTINGS ********************************
alias -l api_client_key return $null | ;return your API key value instead of $null
alias -l check_mywords {
/*
* Change/add strings to search for and the edit distance allowed for each.
* Be careful of short strings and relatively large edit distances.
*** Your edit distance allowed range for each word must be valid isnum syntax.
* Use a distance of 0 for exact matching.
* (Edit distance = minimum number of insertions, deletions, or replacements)
* [http://en.wikipedia.org/wiki/Levenshtein_distance]
*/
return $checkForWords($1, runescape, 0-2, imageshack, 0-2, fbcdn, 0, facebook, 0-2)
}
alias -l debugChannel {
return $null
/*
* Optionally return a channel name to output API lookups and string matches to. Useful for monitoring matches until you're confident that you've whitelisted the most common trusted domains.
* Ops in this channel may add domains to the trusted list by typing !white example.com (read the comments in the text event below).
*/
}
alias -l logMatches return $true | ;turn logging of string matches on or off
alias -l logLookups return $true | ;turn logging of API lookups on or off
alias -l api_client_lookup_enabled return $true | ;set to $false to disable Google's API lookups and use string matching only.
alias -l api_client_name return Sojourner
alias -l api_client_version return 0.01
alias _phishingDir {
;If you edit this, be sure the correct files are moved into the directory!
if (!$isdir($qt($mircdirmalware and phishing\))) mkdir $qt($mircdirmalware and phishing\)
return $qt($mircdirmalware and phishing\ $+ $noqt($1-))
}
;To restrict this script to run only when your nick ($me) is in a list, uncomment the following line and replace the Mars rover names with your nicks.
;on *:text:$($iif(!$istok(Curiosity Opportunity Spirit Sojourner,$me, 32),*)):#:noop
;***************************** END CUSTOMIZED SETTINGS ********************************
on $*:text:/^[!.]((b)lack|(w)hite|(r)emove) [-a-z\d.]+$/Si:#:{
/*
* Syntax: !black/white/remove example.com
* Whitelist domains (especially for sites that match phrases you check for) to prevent lookups and warnings for trusted domains.
* Blacklisted domains will always show a caution message but will still be looked up on the Google API, which may result in an additional warning if a match is returned.
*/
if (!$debugChannel) || ($nick !isop $debugChannel) return
trust - $+ $regml(2) $2
if ($parseDomain($2)) notice $nick *** $+($upper($left($regml(1), 1)), $right($regml(1), -1), :) $v1
else notice $nick *** Invalid domain.
}
alias trust {
/*
* /trust [-rbw] <example.com>
* The -r switch removes an entry.
* The -b/w switches add a domain to the black or white list, respectively.
* If no switch is given, the domain will be added to the white list.
*/
loadtrusted
if (!$regex(trustedalias, $1, /^-[bwr]$/Si)) tokenize 32 -w $1
var %domain $parseURL($2).domain
if (!%domain) {
if (!$nick) echo -asec info * Error: Invalid domain.
return
}
if ($1 == -r) hdel trusted %domain
elseif ($1 == -b) hadd trusted %domain $false
elseif ($1 == -w) hadd trusted %domain $true
hsave trusted $_phishingDir(trusted.hsh)
if (!$nick) echo -asec info * $gettok(White Black Remove, $findtok(-w -b -r, $1, 32), 32) $+ : %domain
}
alias -l loadtrusted {
if ($hget(trusted)) return
hmake trusted
if ($isfile($_phishingDir(trusted.hsh))) hload trusted $_phishingDir(trusted.hsh)
}
on *:start:{
loadtrusted
noop $parseDomain(example.com)
}
on $*:text:$($catchURLex):#:{
var %urls $extractURL($1-)
var %c 1
while ($gettok(%urls, %c, 32)) {
if (!$trusted($v1)) {
if ($api_client_lookup_enabled) && (!$spamcheck($network, #, $gettok(%urls, %c, 32))) google_malwareapi # $gettok(%urls, %c, 32) | ;if page has not been posted to channel recently, send to lookup alias
if ($matchSummary($gettok(%urls, %c, 32))) var %matches %matches $v1 | ;if an alarm string was matched in the domain, add to list
}
inc %c
}
if (%matches) {
if ($debugChannel) {
!msg $debugChannel # $+ : $1-
!msg $debugChannel %matches
}
if ($logmatches) write $_phishingDir(string_phrase_matches.log) $asctime($gmt) GMT $network # $+(<, $nick, >) $1- //matches: %matches
!msg # [CAUTION]: %matches
}
}
alias google_malwareapi {
/*
* /google_malwareapi #channel http://example.com/path/
* Adds #channel to the list of channels waiting for a response for example.com
* If (example.com is pending or cached) { does /send_googlewarning example.com }
* Else { opens a socket to check example.com }
*/
if (!$api_client_key) {
echo -esac info * Error: Google's API requires an API key. Edit alias 'api_client_key' in the header section of the script. https://developers.google.com/safe-browsing/key_signup
halt
}
hadd -m malwareapi $+($network, :chans:, $2) $addtok($hget(malwareapi, $+($network, :chans:, $2)), $1, 32)
if ($hget(malwareapi, $2)) send_googlewarning $2
else {
var %ticks $ticks
while ($dvar(checksite.,%ticks,.site)) inc %ticks
sockopen -e $+(checksite.,%ticks) sb-ssl.google.com 443
set -e %checksite. $+ %ticks $+ .site $2
hadd -mu60 malwareapi $2 pending
if ($debugChannel) msg $debugChannel api lookup $1 $+ : $2
if ($logLookups) write $_phishingDir(google_api_lookups.log) $asctime($gmt) GMT $network $1 $2
}
}
alias send_googlewarning {
/*
* /send_googlewarning example.com
* Sends a warning to channels listed as waiting for a response from that domain with the type of risk and link to more info.
* Sending the message individually rather than using multi-target messages is intentional. (+B should not stop a url warning)
*/
var %url $1
if ($hget(malwareapi, $+($network, :chans:, %url))) tokenize 32 $v1 $debugChannel
else return
var %result $hget(malwareapi, %url)
if (%result == phishing) var %warning Suspected phishing page:
elseif ($v1 == malware) var %warning This site may contain malware.
elseif ($v1 == phishing,malware) var %warning This site may contain malware or try to steal your information.
if (%warning) msg $* [WARNING]: %warning $iif(%result != phishing, You can read more about this warning at: http://safebrowsing.clients.google.com/safebrowsing/diagnostic?site= $+ $url_encode(%url), $replace(%url,.,(dot))) [Advisory provided by Google's Safe Browsing Lookup API]
if ($hget(malwareapi, %url) != pending) {
if ($logLookups) write $_phishingDir(google_api_lookups.log) %result %url
hdel malwareapi $+($network, :chans:, %url)
}
}
on *:SOCKOPEN:checksite.*:{
var %site $dvar($sockname, .site)
var %form $encodeForm(client, $api_client_name, key, $api_client_key, appver, $api_client_version, pver, 3.1, url, %site)
;save the form in case of error for logs
set -e % $+ $sockname $+ .form %form
sockwrite -nt $sockname GET /safebrowsing/api/lookup? $+ %form HTTP/1.1
sockwrite -nt $sockname Host: sb-ssl.google.com
sockwrite -nt $sockname user-agent: mIRC/ $+ $version
if ($cookies) sockwrite -nt $sockname $v1
sockwrite -nt $sockname Connection: close
sockwrite -nt $sockname $crlf
}
on *:SOCKREAD:checksite.*:{
/*
HTTP/1.1 204 No Content
-not in database
HTTP/1.1 200 OK
-phishing
-malware
-phishing,malware
400
-bad request (incorrect format) (missing parameters, invalid url, improper encoding)
401
-api key not authorized
503
-unavailable (failure or throttled)
*/
var %r,%site $dvar($sockname, .site)
var %form $dvar($sockname, .form)
sockread -f %r
if (!$sock($sockname).mark) {
;header
if (%r == $null) sockmark $sockname 1
elseif (%r == HTTP/1.1 204 No Content) hadd -mu1200 malwareapi %site OK
elseif (%r == HTTP/1.1 200 OK) hadd malwareapi %site receiving
elseif (%r == HTTP/1.1 403 Forbidden) logerror %site 403 Forbidden; Form: %form
elseif (HTTP/1.1 401 * iswm %r) logerror %site $v2 $+ ; Form: %form | ;the standard 401 reason phrase is "Unauthorized", so match by wildcard
elseif (%r == HTTP/1.1 400 Bad Request) logerror %site 400 Bad Request; Form: %form
elseif (HTTP/1.1 503 * iswm %r) logerror %site $v2 $+ ; Form: %form
cookie_check %r
}
else {
;content body
if ($hget(malwareapi, %site) == receiving) hadd -mu1200 malwareapi %site %r
}
}
on *:sockclose:checksite.*:{
send_googlewarning $dvar($sockname, .site)
unset $+(%, $sockname, .*)
}
alias -l logerror {
write $_phishingDir(google_malware_api_errors.log) $asctime($gmt) GMT $1-
echo -st Logged Google phishing/malware lookup API Error: $1-
}
alias -l spamcheck {
/*
* $spamcheck($network, #, site)
* returns $true if site/# combo has been checked in the last 20 seconds.
*/
var %result $hget(malwareapi, $+($1, $2, $3))
hadd -mu20 malwareapi $+($1, $2, $3) $true
return %result
}
alias -l matchSummary {
/*
* $matchSummary(example.com)
* checks domain for specific strings
* if a match is found, returns {match:'matchString',domain:'example.com',trusted:'[yes/no/unknown]'}
*/
var %checkmatch = $check_mywords($parseURL($1).fulldomain), %trusted = $trusted($1), %color
if (%trusted == $null) %color = 07unknown
elseif (%trusted == $true) %color = 03yes
elseif (%trusted == $false) {
%color = 04no
if (!%checkmatch) %checkmatch = none
}
if (%checkmatch) && ($parseURL($1).domain) return $+({match:',%checkmatch,',$chr(44),domain:',$v1,',$chr(44),trusted:',$chr(3),%color,$chr(3),',$chr(125))
}
alias -l trusted return $hget(trusted, $parseURL($1).domain)
alias checkForWords {
/*
* $checkForWords(string, word1, N1[-N2], word2, N1[-N2], ...,N1[-N2], wordN)
* Checks each group of alphanumeric characters against a list of words. Returns the first word whose Levenshtein distance is within the range N1-N2.
* $checkForWords(runerscape.com, runescape, 0-1) would return runescape
* $checkForWords(imageshaack.us, runescape, 0-1, imageshack, 0-2) would return imageshack
* returns $null if there was no match
*/
;fill backreferences with all alphanumeric strings
noop $regex(checkwords,$1,/([a-z\d]+)/Sig)
;loop through backreferences
var %c 1
while ($regml(checkwords, %c) != $null) {
;loop through match strings being checked
var %word 2, %range 3
while ($eval($+($, %word), 2) != $null) {
;If a match string is within its allowed edit distance from the string being checked, return the match string.
if ($levenshtein($regml(checkwords, %c), $v1) isnum $eval($+($, %range), 2)) return $eval($+($, %word), 2)
inc %word 2
inc %range 2
}
inc %c
}
}
alias tld_list_url return http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1
alias dvar {
/*
* $dvar(foo., bar., %ticks)
* returns value of %foo.bar.46516501
*/
return $eval($+(%, $replace($1-, $chr(32), $null)), 2)
}
alias url_encode {
;source/license:
;https://github.com/david-schor/CodeArchive/blob/master/mSL/net/url_encode.mrc
return $regsubex(urlencode,$1, /([\W\s])/Sg, $iif(\t == $chr(32), +, $+(%, $base($asc(\t), 10, 16, 2))))
}
alias encodeform {
/*
* $encodeform(user, name, pass, pass word)
* returns user=name&pass=pass+word
* leading, trailing, and consecutive spaces are trimmed...
*/
paramToVar $*
var %form $regsubex(encodeform,$left($str(@=@&, $calc($0 / 2)), -1), /@/g, $url_encode($eval($+(%, param., \n), 2)))
unset %param*
return %form
}
alias -l paramToVar {
inc -u %param
set -u %param. $+ %param $1-
}
alias -l _cookieFile return cookies.ini
alias cookie_check {
/*
* /cookie_check <header>
* Checks if a header response sets a cookie.
* TODO: real cookie parsing
*/
if (!$sockname) return
elseif ($regex(cookie,$1-,/^Set-Cookie: ([^=]+)=([^;]+)/i)) {
var %1 $regml(cookie,1), %2 $regml(cookie,2) | ;just to be sure $regml() getting reset later on won't ever cause an issue
writeini $_cookieFile $parseDomain($sock($sockname).addr) %1 %2
}
}
alias cookies {
/*
* $cookies
* returns Cookie: cookie=value; cookie2=value;
* returns $null if no cookies are saved for site
*/
if (!$sockname) || (!$isid) return
var %addr $parseDomain($sock($sockname).addr)
var %c 1
while ($readini($_cookieFile,n,%addr,$ini($_cookieFile,%addr,%c)) != $null) {
var %cookies %cookies $ini($_cookieFile,%addr,%c) $+ = $+ $v1 $+ ;
inc %c
}
if (%cookies) return Cookie: %cookies
}
alias catchURLex return /\b((?:https?://)?)([a-z\d][-a-z\d]*(?:\.[a-z\d][-a-z\d]*)+)((?:/[-a-z\d._~:/!$&()*+=,;%]*)?)((?:\?[-a-z\d._~:/!$&()*+=,;%?]*)?)((?:#[-a-z\d._~:/!$&()*+=,;%]*)?)/Sig
alias extractURL {
/*
* $extractURL(STRING)
* Returns a space-delimited list of all URLs in STRING.
*/
if (!$regex(catchurl,$1-,$catchURLex)) return
;URL is caught into 5 backreferences. Some can be empty. The second is the domain and it's the only non-optional one
var %c 2
while ($regml(catchurl, %c)) {
;only labels with a valid public suffix are added to the url list (aka removing matches like 'o.o')
if ($parseDomain($regml(catchurl, %c))) {
var %protocol $iif($regml(catchurl, $calc(%c - 1)), $lower($v1), http://)
var %domain $lower($regml(catchurl, %c))
var %path $iif($regml(catchurl, $calc(%c + 1)), $v1, /)
var %query $regml(catchurl, $calc(%c + 2))
var %fragment $regml(catchurl, $calc(%c + 3))
var %urls $addtok(%urls,$+(%protocol, %domain, %path, %query, %fragment),32)
}
inc %c 5
}
return %urls
}
alias parseURL {
/*
** $parseURL(example.com)
** Parses a single URL and returns the section specified by $prop.
** If $prop is $null, returns full URL.
** Properties:
* protocol
* fulldomain
* domain
* publicsuffix
* path
* query
* pathquery
* fragment
* TODO: Port, IPv4, IPv6
*/
var %url $extractURL($1)
if (!$prop) || (!%url) return %url
noop $regex(catchurl, %url, $catchURLex)
if ($prop == protocol) return $gettok($regml(catchurl, 1), 1, 58)
elseif ($prop == fulldomain) return $regml(catchurl, 2)
elseif ($prop == domain) return $parseDomain($regml(catchurl, 2))
elseif ($prop == publicsuffix) return $parseDomain($regml(catchurl, 2)).suffix
elseif ($prop == path) return $regml(catchurl, 3)
elseif ($prop == query) return $regml(catchurl, 4)
elseif ($prop == pathquery) return $+($regml(catchurl, 3), $regml(catchurl, 4))
elseif ($prop == fragment) return $regml(catchurl, 5)
}
alias parseDomain {
/*
* returns a single label with a public suffix appended
* $parsedomain(example.com)
* returns $null if .com is not a valid top level domain
* otherwise returns 'example.com'
* $parsedomain(foo.example.co.uk) returns 'example.co.uk'
* $parsedomain(foo.example.co.uk).suffix returns 'co.uk'
*/
;Some top level domains (like .jp) have a lot of rules and thus are expensive to parse. Caching allows $parseDomain() to be used repeatedly on a domain without repeating the expensive parsing.
if ($hget(domaincache,0).item > 20000) hfree domaincache | ;prevent the cached domains table from becoming very large
if (!$hget(domaincache)) hmake domaincache 10000
if (!$hget(domaincache,$1)) hadd domaincache $1 $parseDomainInternal($1).both
if ($prop == suffix) return $gettok($hget(domaincache, $1), 2, 32)
else return $gettok($hget(domaincache, $1), 1, 32)
}
alias -l parseDomainInternal {
if ($numtok($1, 46) < 2) return
if (!$hget(tld)) filltld
var %tld = $gettok($1, -1, 46)
if ($hget(tld, %tld) !isnum) return | ;don't check for $null, domains could end up like '.ps4'...
var %rules = $v1, %c = 0, %level = 1, %result
while (%c <= %rules) {
var %currentrule = $hget(tld, %tld $+ %c), %rulex = /((?<=\.|^) $+ $replace(%currentrule, ., \., !, $null, *, [-a-z\d]+) $+ )$/i
if ($regex(suffix, $1, %rulex)) {
if ($left(%currentrule, 1) == !) { %result = $remove(%currentrule, !) | break }
if ($numtok(%currentrule, 46) > $numtok(%result,46)) %result = $regml(suffix, 1)
}
inc %c
}
if (%result == $1) return | ;example: $parsedomaininternal(co.uk) returns $null
if ($prop == suffix) return %result
var %domain $lower($gettok($1, $calc($numtok($1, 46) - $numtok(%result, 46)) $+ -, 46))
if ($prop == both) return %domain %result
return %domain
}
alias -l _mozillaSuffixList return $_phishingDir(mozilla_effective_tld_names.txt)
alias -l filltld {
;fills tld hash table with mozilla's tld list
if ($hget(tld)) hfree tld
hmake tld 10000
if (!$file($_mozillaSuffixList)) {
.timer 1 0 tldGetDialog
hfree tld
halt
}
else filter -fkg $_mozillaSuffixList addrule ^(?!//)\S
}
alias -l tldGetDialog {
if (%tldgetdialog) return
else set -eu6000 %tldgetdialog $true
beep 1
var %msg $!parseDomain() requires mozilla's public suffix list to be saved to $_mozillaSuffixList in order to function. $crlf $+ The list and notepad.exe should have opened automatically, but if they haven't the list is here: http://goo.gl/ht6EO - When saving from notepad you must select 'File -> Save as' and overwrite the existing mozilla_effective_tld_names.txt with UTF-8 encoding selected.
noop $input(%msg,obv,Setup Required)
.timer 1 1 unset %tldgetdialog
url -an $tld_list_url
write $_mozillaSuffixList
run $_mozillaSuffixList
echo -aesc info * %msg
}
alias -l addrule {
;adds a tld rule to the table
var %tld = $gettok($1, -1, 46)
if ($hget(tld,%tld) == $null) hadd tld %tld 0
else hinc tld %tld
hadd tld %tld $+ $hget(tld, %tld) $1
}
/*
* This is a rewritten version of codemastr's Levenshtein Distance alias. This version fixes a few errors and runs faster.
* In addition to rewriting parts, I've also included more details in the comments.
* Additional information about the function of the alias is available with the original version:
* http://www.mircscripts.org/showdoc.php?type=code&id=2127
* http://www.mircscripts.org/comments.php?cid=2127
*
* Syntax 1: $levenshtein(string1, string2)
* Syntax 2: $levenshtein(string1, string2, insertCost, replaceCost, deleteCost)
* $editdistance() is the same function.
* Case-sensitive versions are $levenshteincs()/$editdistancecs()
*
* Modified by Yawhatnever (Travis) - irc.swiftirc.net #mSL
* Free to use in any script, just attribute the sources above :)
*/
alias editdistance return $levenshteininternal($1,$2,$3,$4,$5,$false)
alias editdistancecs return $levenshteininternal($1,$2,$3,$4,$5,$true)
alias levenshtein return $levenshteininternal($1,$2,$3,$4,$5,$false)
alias levenshteincs return $levenshteininternal($1,$2,$3,$4,$5,$true)
alias -l levenshteininternal {
var %x = $len($1), %y = $len($2), %matrixsize.y = $calc(%y + 1)
if ($5 isnum) var %ins_cost = $3, %rep_cost = $4, %del_cost = $5
else var %ins_cost = 1, %rep_cost = 1, %del_cost = 1
if (!%x) return $calc(%y * %del_cost) | ;matches the left column of the matrix (delete cost)
if (!%y) return $calc(%x * %ins_cost) | ;matches the bottom row of the matrix (insert cost)
hmake lvmatrix
set -u %matrixsize.x $calc(%x + 1)
;fill bottom row with insert cost
var %i, %c = 1, %cost = %ins_cost
while (%c < %matrixsize.x) {
matrixset %c 0 %cost
inc %c
inc %cost %ins_cost
}
;fill left column with delete cost
var %c = 0, %cost = 0
while (%c < %matrixsize.y) {
matrixset 0 %c %cost
inc %c
inc %cost %del_cost
}
%c = 1
while (%c <= %x) {
%i = 1
while (%i <= %y) {
if ($levenshteinequal($mid($1, %c, 1), $mid($2, %i, 1), $6)) %cost = 0
else %cost = %rep_cost
matrixset %c %i $levenshteinmin(%c, %i, %ins_cost, %del_cost, %cost)
inc %i
}
inc %c
}
var %return $matrixget(%x, %y)
;The following line is used for debug purposes.
;var %c %y | while (%c >= 0) { echo -sg $regsubex($left($str(@-,%matrixsize.x),-1),/@/g,$base($matrixget($calc(\n - 1),%c),10,10,2)) | dec %c }
:error
hfree lvmatrix
return %return
}
alias -l levenshteinmin {
/*
* compare(str1, str2)
* the value at point ($len(str1), $len(str2)) on the grid will be the levenshtein/edit distance
* use $matrixget(x, y) to get the value at point (x, y)
*
*{y}
* 2|4|3|2|1|1|
* r|3|2|1|0|1|
* t|2|1|0|1|2|
* s|1|0|1|2|3|
* |0|1|2|3|4|
* |s|t|r|1|{x}
*/
; bottom row is insert cost, left column is delete cost
; $levenshteinmin(x,y,ins_cost,del_cost,rep_cost)
var %left = $calc($matrixget($calc($1 - 1), $2) + $3)
var %below = $calc($matrixget($1, $calc($2 - 1)) + $4)
var %diag = $calc($matrixget($calc($1 - 1), $calc($2 - 1)) + $5)
return $gettok($sorttok(%left %below %diag, 32, n), 1, 32)
}
alias -l matrixset hadd lvmatrix $calc(%matrixsize.x * $2 + $1) $3
alias -l matrixget {
/*
* $matrixget(x coord, y coord)
* returns the value stored at point (x, y)
* bottom left is (0, 0)
* |6|7|8|
* |3|4|5|
* |0|1|2|
* e.g. value of point(2, 2) is stored in the hash table with key "8"
*/
return $hget(lvmatrix, $calc(%matrixsize.x * $2 + $1))
}
alias -l levenshteinequal {
;character 1, character 2, $true = case sensitive/$false = insensitive
if ($1 != $2) return $false
elseif ($1 === $2) return $true
elseif (!$3) return $true
}
I was hoping it would just auto-grab all inappropriate websites by using the Google API. I guess I could add them to the blacklist, although that process seems lengthy. I have a mock-up script I use right now for banning all links, and I was hoping I could use this one instead, but as I read through it I realized it lacks permit commands and the like. Maybe I will look into blending the two scripts. Anyway, I am looking to create a full chat bot script much like Nightbot or Moobot and post it for free online. Although I am new to mIRC, I am not all that new to scripting. Would you be interested in this project?
Google's API only checks against their phishing and malware blacklists. If you don't allow links that have been deemed inappropriate and you don't want to make the moderators do any work, then a whitelist will always work better than a blacklist.
I wondered if you were planning to use it for Twitch when you mentioned messages being removed; that's not something normally included with IRC clients.
This script is probably not what you're looking for. It would take a fair amount of work to modify it to allow each channel to control what it does when links are posted. It also only handles "clickable" links, i.e. links that have not been modified to avoid a filter (since the main purpose was to deal with phishing links being posted and people clicking without realizing it was a fake website).
As much as I hate to admit it, mIRC simply can't operate at the scale of Nightbot/moobot. If you want to offer some of the same features with your own twist for a few channels (or make the scripts available so people can run their own version for their channel) that's one thing, but it would be impossible to run a bot for even a few hundred channels using mIRC.
This is what I get http://gyazo.com/0f5f52736560bcdfb7245130e0a44e05 when I put the key here http://gyazo.com/e6d16318266564f4b2d266307bca2cbf
Did you blacklist the domain of the porn site? I didn't really write it with filtering out porn or other "inappropriate" links in mind, but it can function as a blacklist if that's what you want.
Also, unless you've added some commands to ban a user the most it will do is post a warning. With the possibility of false positives or incorrect usage of black/white lists, I felt it was best to leave out banning and let moderators decide how to handle the warnings. You can change the behavior to kick/ban relatively easily if that's what you would prefer.