Saturday, June 18, 2022

Day 6/100 Word Count

 Word Count 6/100

Let me count the words, and other things too!


It's been going nicely for a week. I have been experimenting with different features of the C programming language, and so far there's no fread() to be seen. I used to do it with fopen() and stuff, but as I discovered new things, I decided to do things the new way. It's fun, and part of the learning process.

Today, I decided to do the old wc (word count) program. The program counts more than words. It also counts bytes/characters, lines, and even the maximum length of the line in the file. As this is just a proof of concept, I kept it simple. Interestingly, the most confusing part is the options. It's not that bad, but it's still bad. However, the core of wc is very simple:


int wc() {
  char *buffer=NULL; size_t bufLen=0;
  ssize_t bufSize=0; int i; int state=0;
  int c=0; int l=0; int L=0; int w=0; int s=0;
  /* Parenthesize the assignment: > binds tighter than = */
  while ((bufSize=getline(&buffer, &bufLen, stdin))>0) {
    s=strlen(buffer); l++; c+=s; L=(L<=s)?s:L;
    /* state: 0 = between words, 1 = inside a word */
    for (i=0;i<s;i++)
      if (isspace((unsigned char)buffer[i])) state=0;
      else if (state==0) (state=1) && w++; /* (state=1) is true, so w++ runs */
  }
  printf("%d\t%d\t%d\t%d\n",l,w,c,L);
  free(buffer); return 0;
}


As you can see, it just reads each line as it arrives, counts the length of the string, and increments the line counter. As for counting the words? It just takes 4 lines. I cheated a bit and used the && operator to double up 2 commands without using curly braces. Sneaky! It's not a practice I can recommend if you're just starting to learn to code.

As for the professional version, it can be very verbose, but the core of the program remains the same. This is just about the only function that matters in a word count program, and learning it will help you write various computer programs relatively easily. Either we're in data (inside a word), or we're not in data (in the whitespace between words). That is the essence of data harvesting in an input line.


#define _POSIX_C_SOURCE 200809L  /* asks for POSIX getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>


int Debug=0;
int optc=0;
int optl=0;
int optL=0;
int optw=0;


void ShowMsg() {
  /* A string literal cannot contain a raw line break, so the text is
     split into adjacent literals, which the compiler joins together. */
  puts(
    "Usage: wc [OPTION]...\n"
    "Print newline, word, and byte counts for standard input.\n"
    "A word is a non-zero-length sequence of characters\n"
    "delimited by white space. wc reads standard input.\n"
    "\n"
    "The options below may be used to select which counts are printed, always in\n"
    "the following order: newline, word, byte, max line length.\n"
    "  -c, --bytes            print the byte counts\n"
    "  -l, --lines            print the newline counts\n"
    "  -L, --max-line-length  print the maximum line length\n"
    "  -w, --words            print the word counts\n"
    "      --help             display this help and exit\n"
    "      --debug            debug info");
}


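int GetOpt(int argc, char *argv[]) {
  int i;
  for (i=1;i<argc;i++) {
    /* Accept the short form and the long form named in the help text */
    if (!strncmp("-c",argv[i],2) || !strcmp("--bytes",argv[i])) optc=1;
    if (!strncmp("-l",argv[i],2) || !strcmp("--lines",argv[i])) optl=1;
    if (!strncmp("-L",argv[i],2) || !strcmp("--max-line-length",argv[i])) optL=1;
    if (!strncmp("-w",argv[i],2) || !strcmp("--words",argv[i])) optw=1;
    if (!strncmp("--help",argv[i],6)) {
      ShowMsg(); exit(0); /* the help text promises "and exit" */
    }
    if (!strncmp("--debug",argv[i],7)) Debug=1;
  }

  if (Debug) {
    puts("Debug...");
  }
  return 0;
}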


int wc() {
  char *buffer=NULL;
  size_t bufLen=0;
  ssize_t bufSize=0;
  int i; int state=0;
  int c=0; int l=0; int L=0; int w=0; int s=0;
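  /* No option given: default to lines, words and bytes.
     Test the option flags here, not the counters (which are always 0). */
  if (!optc && !optl && !optL && !optw) optc=optl=optw=1;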


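  /* Parenthesize the assignment: > binds tighter than = */
  while ((bufSize=getline(&buffer, &bufLen, stdin))>0) {
    s=strlen(buffer); l++; c+=s; L=(L<=s)?s:L;
    /* state: 0 = between words, 1 = inside a word */
    for (i=0;i<s;i++)
      if (isspace((unsigned char)buffer[i])) state=0;
      else if (state==0) (state=1) && w++; /* (state=1) is true, so w++ runs */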
  }


  if (optl) printf("%d\t",l);
  if (optw) printf("%d\t",w);
  if (optc) printf("%d\t",c);
  if (optL) printf("%d\t",L);
  printf("\n");
  free(buffer);
  return 0;
}



int main (int argc, char *argv[] ) {
  int e=0;
  GetOpt(argc, argv);
  e=wc();
  return e;
}

And that's about it. I will probably redo this program sometime in the future, once I decide to pick up file functions.

