Word Count

By vegeto079 on Jul 17, 2009

In just a few words, this script takes a .txt or similar file, reads through it, and tells you how often certain words occur.
This can be used to see how you talk during conversations, or maybe even to see what words to use less for a school project.

To start, type /wordcount (or /wordcheck) in any chatroom or window.

The first thing that pops up asks "What is the filename of the document?"
Here you type the location of the .txt (or other text) file that you want to count from. Include the full path, such as C:\Program Files\etc, if the file is not in the mIRC folder.
For example: "C:\wordcount\convo.txt"
If you type an incorrect file location, leave it blank, or pick a file that mIRC cannot read, the script stops.
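
If you want to test a path before running the whole script, something like the following might help. It is only a sketch: the alias name wcfile is made up for this example and is not part of the script below. It uses mIRC's $isfile and $lines identifiers to report whether the file can be found and how many lines it has.

```mirc
; Hypothetical helper, not part of the wordcount script.
; Usage: /wcfile C:\wordcount\convo.txt
alias wcfile {
  ; Stop early when nothing was typed after the command.
  if ($1 == $null) { echo -a No filename given. | return }
  ; $isfile is true only when the file exists.
  if (!$isfile($1-)) { echo -a Cannot find $1- | return }
  echo -a $1- exists and has $lines($1-) lines.
}
```

This check is a little stricter than the one in the script, which only looks at whether the first line of the file comes back empty.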

The next thing is "What is the person's name who you don't want to see wordcount for?"
If you are looking at a conversation between two people and only want to see wordcount for ONE of them, type in the other person's name here.
For example, if I have a friend named "Blahblah" and my name is "Vegeto079", I would type "Blahblah" in this box so that I can see my results.
This can be left empty if it does not apply to you or you want to see the whole document's wordcount.

The next thing is "How many seperating spaces are before the names? (-1 if no names)"
This is the number of spaces that come before the final word of the name.
For example, take this log line:
"12:47 AM Blahblah: Hey, what's up?"
In this case, there are 2 spaces before the person's name. If the name has more than one word, count every space that separates words, up to the final word of the name:
"12:47 AM Blah blah: Hey, what's up?" (this would be 3)
"12:47 AM B l ah blah: Hey, what's up?" (this would be 5)
If there are no names, enter -1.
Leaving it blank counts as 0.
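
To see what a given number does to a line, here is a small sketch you can try. The alias name wcstrip is made up for this example. It treats the line as space-separated words, the same way the script does with $gettok, and shows which part is thrown away as timestamp and name and which part is kept as the message.

```mirc
; Hypothetical helper, not part of the wordcount script.
; Usage: /wcstrip <spaces> <log line>
alias wcstrip {
  ; Words 1 through (spaces + 1) are the timestamp and the name.
  echo -a Prefix: $gettok($2-, 1- $+ $calc($1 + 1), 32)
  ; Everything from word (spaces + 2) onward is the message.
  echo -a Message: $gettok($2-, $calc($1 + 2) $+ -, 32)
}
```

For example, /wcstrip 2 12:47 AM Blahblah: Hey, what's up? should show "12:47 AM Blahblah:" as the prefix and "Hey, what's up?" as the message.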

The final thing it asks is "What should be the lowest number before it stops finding occurances?"
This is a very important question, because it greatly changes how long the script takes to finish.
Words that do not occur more than this many times are ignored. If you enter 10 and a word occurs only 9 times (or exactly 10 times) in your file, it is left out of the results.
You do NOT want this at a very low number, because the script will take a very long time to complete.
If left blank, it will be 0.

The script somewhat freezes mIRC if you try to do anything while it runs, because it stays in a loop for long periods of time (this still needs fixing). Once the script starts, don't do ANYTHING in mIRC, and you will be able to watch its progress until it finishes.
The finished file is saved as "C:\wordCountResults.txt"
In it you will see lines like:
i 1509
hey 391
sup 34
These are the words and how many times each appeared in the file, most frequent first, not counting the lines you told the script to skip.

In case a loop accidentally gets stuck, the script cancels itself after 10,000 passes. The downside is that you cannot put HUGE files through it. Instead, split them into smaller documents.

Sorry for the confusing intro, but once you get to use the script a few times, it's easier to understand.
Everything here is subject to change in case there is confusion or there are updates.

Paste the script in remotes and type /wordcheck or /wordcount

alias wordcount {
  window @wordcounttest
  window @wordcount

  .remove C:\wordCountTemp.txt
  echo @wordcount wordCountTemp.txt removed
  .remove C:\wordCountTempTwo.txt
  echo @wordcount wordCountTempTwo.txt removed
  .remove C:\wordCountResults.txt
  echo @wordcount wordCountResults.txt removed
  unset %wc...*
  echo @wordcount Deleted all variables that start with wc...
  unset %wordcount...*
  echo @wordcount Deleted all variables that start with wordcount...

  set %wc...filename $?="What is the filename of the document?"
  echo @wordcount Filename is %wc...filename
  if ($read(%wc...filename,1) == $null) {
    echo @wordcount Error with filename.
    halt
  }
  ;;find file

  set %wc...stop $?="What is the person's name who you don't want to see wordcount for?"
  echo @wordcount Don't save the message if it has %wc...stop
  ;;lets you see what only one person writes

  set %wc...namecount $?="How many seperating spaces are before the names? (-1 if no names)"
  echo @wordcount %wc...namecount spaces before the name
  ;;delete timestamp and name

  set %wc...lowestallowed $?="What should be the lowest number before it stops finding occurances?"
  echo @wordcount Stop when you get to %wc...lowestallowed occurances

  set %wc...a 0
  echo @wordcount Starting...
  while ($read(%wc...filename,$calc(%wc...a + 1)) != $null) { 
    :goto
    inc %wc...a
    ;;show percent done every 100 lines; $isdecimal is not built into mIRC, so the // operator (is a multiple of) is used instead
    set %wc...k $round($calc((%wc...a *100) / $lines(%wc...filename)),1)
    if (100 // %wc...a) echo @wordcounttest %wc...k $+ % done..
    set %wc...line $read(%wc...filename, %wc...a)
    if (%wc...stop isin %wc...line) {
      inc %wc...r
      ;;report every 500 skipped lines
      if (500 // %wc...r) echo @wordcounttest Took away %wc...r lines
      goto goto
    }
    set %wc...lineEdit $remove(%wc...line,$gettok(%wc...line,1- $+ $calc(%wc...namecount + 1),32))
    set %wc...lineEdit $remove(%wc...lineEdit,?,!,.,$chr(44),",*,\par,$chr(40),$chr(41),[,],_)
    set %wc...totalLines $gettok(%wc...lineEdit, 0, 32)
    set %wc...i 0
    while (%wc...i < %wc...totalLines) {
      inc %wc...i
      set %wc...word $gettok(%wc...lineEdit, %wc...i, 32)
      if (%wordcount... [ $+ [ %wc...word ] ] != $null) {
        inc %wordcount... [ $+ [ %wc...word ] ]
      }
      if (%wordcount... [ $+ [ %wc...word ] ] == $null) {
        write C:\wordCountTemp.txt %wc...word
        set %wordcount... [ $+ [ %wc...word ] ] 1
      }
    }
  }
  write C:\wordCountTemp.txt End-Of-Words
  echo @wordcount wordCountTemp.txt finished
  set %wc...i 0
  while (%wc...word != End-Of-Words) {
    inc %wc...i
    set %wc...word $read(C:\wordCountTemp.txt, %wc...i)
    if (%wordcount... [ $+ [ %wc...word ] ] > %wc...lowestallowed) {
      if (%wc...word != End-Of-Words) write C:\wordCountTempTwo.txt %wc...i $+ : %wc...word %wordcount... [ $+ [ %wc...word ] ]
    }
  }
  echo @wordcount wordCountTempTwo.txt finished
  set %wc...i 0
  set %wc...highest 0
  echo @wordcount Starting second loop...
  set %wc...f
  set %wc...n $lines(C:\wordCountTempTwo.txt)
  while ($read(C:\wordCountTempTwo.txt, 1) != $null) {
    inc %wc...f
    if (%wc...f > 10000) {
      echo @wordcounttest wc...f went above 10000, something went wrong.
      halt
    }
    ;;show percent done every 100 passes, using this loop's own counter
    set %wc...m $round($calc((%wc...f *100) / %wc...n),1)
    if (100 // %wc...f) echo @wordcounttest %wc...m $+ % done..
    if (%wc...highest != 0) {
      write C:\wordCountResults.txt $gettok($read(C:\wordCountTempTwo.txt, %wc...highestline),2-,32)
      write -dl [ $+ [ %wc...highestline ] ] C:\wordCountTempTwo.txt 
    }
    set %wc...i 0
    set %wc...highest 0
    set %wc...g 0
    while ($read(C:\wordCountTempTwo.txt, $calc(%wc...i + 1)) != $null) {
      inc %wc...g
      if (%wc...g > 10000) {
        echo @wordcounttest wc...g went above 10000, something went wrong.
        halt
      }
      inc %wc...i
      ;;report every 200 lines scanned
      if (200 // %wc...i) echo @wordcounttest %wc...i lines done..
      set %wc...word $read(C:\wordCountTempTwo.txt, %wc...i)
      set %wc...countword $gettok(%wc...word,3,32)
      set %wc...numberline %wc...i
      if (%wc...countword > %wc...highest) {
        set %wc...highest %wc...countword
        set %wc...highestline %wc...numberline
      }
    }
  }
  unset %wc...*
  unset %wordcount...*
  .remove C:\wordCountTemp.txt
  .remove C:\wordCountTempTwo.txt
  echo @wordcount Done! File saved at C:\wordCountResults.txt
  echo @wordcount If you want to save your results, please change the name of that file
}
alias wordcheck {
  wordcount
}
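
If the script is too slow on your files, counting in a hash table might be worth a try. The following is only a rough sketch of that idea, not the method the script above uses. The alias name wcquick and the table name wcquick are made up, it strips fewer punctuation characters, it has no option to skip a person or remove names, and it echoes the counts in no particular order instead of writing a sorted file. It will still keep mIRC busy while it loops.

```mirc
; Hypothetical alternative, not the author's method.
; Usage: /wcquick <lowest count> <filename>
alias wcquick {
  var %min = $1
  var %file = $2-
  if (!$isfile(%file)) { echo -a Cannot find %file | return }
  ; Start from an empty hash table on every run.
  if ($hget(wcquick)) hfree wcquick
  hmake wcquick 1000
  var %total = $lines(%file)
  var %n = 1
  while (%n <= %total) {
    ; n: do not evaluate the text, t: treat the first line as plain text.
    var %line = $remove($read(%file, nt, %n), ?, !, ., $chr(44), ")
    var %i = 1
    while (%i <= $numtok(%line, 32)) {
      ; hinc creates the item on first use and adds 1 after that.
      hinc wcquick $gettok(%line, %i, 32)
      inc %i
    }
    inc %n
  }
  ; Echo every word that was seen more than %min times.
  var %i = 1
  while (%i <= $hget(wcquick, 0).item) {
    var %word = $hget(wcquick, %i).item
    if ($hget(wcquick, %word) > %min) echo -a %word $hget(wcquick, %word)
    inc %i
  }
  hfree wcquick
}
```

The point of the design is that each word lives in one table that is freed at the end, instead of one global variable per word plus two temporary files that are read over and over.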

Comments

vegeto079   -  Jul 24, 2009

It's wordcount, not wordcheck, sorry about that.
I'll fix what it says in the description and add wordcheck as a command as well.

NightBlade   -  Jul 22, 2009

didnt work for me.

WORDCHECK Unknown command
