% Adapted from Alex Reustle's CMSC351 Course Notes
% This program is free software: you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation, either version 3 of the License, or
% (at your option) any later version.
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
% You should have received a copy of the GNU General Public License
% along with this program. If not, see <http://www.gnu.org/licenses/>.
\documentclass[english, 10pt]{article}
\usepackage{notes}
\usepackage{inconsolata}
\usepackage[shellescape]{gmp}
\allowdisplaybreaks%
\newcommand{\thiscoursecode}{CMSC 216}
\newcommand{\thiscoursename}{Introduction to Computer Systems}
\newcommand{\thisprof}{Dr.\ Ilchul Yoon}
\newcommand{\me}{Akilesh Praveen}
\newcommand{\thisterm}{Spring 2020}
\newcommand{\website}{http://cs.umd.edu/class/spring2020/cmsc216/}%chktex 8
\usepackage{ifpdf}
\ifpdf%
\DeclareGraphicsRule{*}{mps}{*}{}
\fi
% \listfiles
\usepackage[utf8]{inputenc}
\usepackage{listings}
\usepackage{xcolor}
\usetikzlibrary{patterns}
\definecolor{codegreen}{rgb}{0,0.6,0}
\definecolor{codegray}{rgb}{0.5,0.5,0.5}
\definecolor{codepurple}{rgb}{0.58,0,0.82}
\definecolor{backcolour}{rgb}{0.95,0.95,0.94}
\definecolor{codered}{rgb}{0.5,0.15,0.15}
\definecolor{commentred}{rgb}{1,0.01,0.02}
\lstdefinestyle{mystyle}{
backgroundcolor=\color{backcolour},
commentstyle=\color{codegreen},
keywordstyle=\color{red},
numberstyle=\tiny\color{codegray},
stringstyle=\color{codered},
basicstyle=\ttfamily\footnotesize,
breakatwhitespace=false,
breaklines=true,
captionpos=b,
keepspaces=true,
xleftmargin=.15\textwidth,
xrightmargin=.15\textwidth,
linewidth=\textwidth,
numbers=left,
numbersep=5pt,
showspaces=false,
showstringspaces=false,
showtabs=false,
tabsize=2,
belowskip=3em,
aboveskip=3em,
}
\lstset{style=mystyle}
% \VerbEnvir{align tikzpicture algorithm}
%%%Headers
\chead{216-Introduction to Computer Systems}
\lhead{\thisterm}
%%%%% TITLE %%%%%
\graphicspath{{../}}
\newcommand{\notefront}{%
\pagenumbering{arabic}
\begin{center}
{\small}
\textbf{\Huge{\noun{\thiscoursecode}}}
{\Huge \par}
{\Large{\noun{\thiscoursename}}}\\
\vspace{0.1in}
\vspace{0in}\includegraphics[scale=0.3]{umd_cs.jpg} \\
\vspace{0.1in}{\noun\me} \\
{\noun\thisprof} \ $\bullet$ \ {\noun\thisterm} \ $\bullet$ \ {\noun{University of Maryland}} \\
{\ttfamily \url{\website}} \\
\end{center}
}
\tikzstyle{class}=[
rectangle,
draw=black,
text centered,
anchor=north,
text=black,
text width=2cm,
shading=axis,
bottom color={rgb:red,222;green,222;blue,222},
top color=white,shading angle=45]
\begin{document}
% \renewcommand\familydefault{\sfdefault}
% \sffamily
% Notes front
\notefront%
% Table of Contents and List of Figures
\tocandfigures%
\section{Notes \& Preface}
This is a compilation of my notes for CMSC216 as a TA for the Spring 2020 offering of the course at the University of Maryland. All content covered in these notes was created by Dr. Ilchul Yoon and Dr. A.U. Shankar at the University of Maryland.
\newline
The actual content of this note repository is the content that I cover as a TA during my discussion section, combined with my personal insights for the course. I took this course with Nelson Padua-Perez in the Spring 2019 offering, so some of the notes that I'll drop in here are from my own notes when I took the course in 2019. As such, I would like to attribute certain code examples, analogies, and more to Mr. Perez. I believe that together, these will serve as great \textbf{supplementary material} for CMSC216, but I would still highly recommend attending all of your lecture and discussion sections to achieve success in CMSC216.
\newline
The notes template in use is Alex Reustle's template, which can be found on his github at the following location: \texttt{https://github.com/Areustle/CMSC351SP2016FLN}
\linebreak
I maintain this repository and as such, take responsibility for any mistakes. Please send errors to \texttt{[email protected]}
\section{Week 1 - Introduction to CMSC216}
CMSC216 is where you learn how a computer works on a much lower level than you've experienced before. There are 3 main components that the course will explore.
\subsection{Overview}
\begin{itemize}
\item \textbf{UNIX} Threads, processes, and pipes as the building blocks of much bigger applications. We will be working with the UNIX operating system on the development environment at \texttt{grace.umd.edu}
\item \textbf{C} is a high-performance language that works at a much lower level than Java. Things like memory management and advanced data structures are left up to the user. We'll cover concepts like memory management, pointers, and system calls.
\item \textbf{Assembly} is even lower-level than C, and studying it will reveal how processors process instructions, store data, and maintain a stack and a heap. It's the lowest level you'll go in this class. For this semester's 216, you will be using MIPS assembly.
\end{itemize}
\subsection{Grace}
In this class, we will be using the \texttt{Grace} system to do all of our work. It's a little confusing to understand at first, so here's my way of thinking about it. In CMSC132, we did all of our work on our own computers. We pulled the skeleton code for the projects from the 132 website/repository, edited the code on our computers, and then uploaded our code to the submit server (via Eclipse) in order to test it.\newline
In CMSC216, we have been given access to this big computer that UMD CS owns known as \texttt{Grace}. You, as a student, have been given a small chunk of that machine to call your own (for the semester). In this class, we will access your files on the \texttt{Grace} system using a program known as \texttt{ssh} (that's how MobaXTerm works) and do all of our editing + running code on \texttt{Grace} itself. In fact, we will also be submitting our projects from \texttt{Grace} to the UMD CS submit server.\newline
Here are the relevant links for getting it all set up. You'll need to setup \texttt{Grace} and \texttt{gcc} (the C compiler that we'll be using within \texttt{Grace}).\newline\newline
\begin{itemize}
\item \texttt{\url{http://www.cs.umd.edu/~nelson/classes/resources/GraceSystem.shtml}}
\item \texttt{\url{http://www.cs.umd.edu/~nelson/classes/resources/setting_gcc_alias.shtml}}
\end{itemize}
\subsection{Useful UNIX Commands}
Although the UNIX environment may seem confusing at first, learning it is essential to navigating the Grace environment. Below are some of the basic commands that you may find useful when getting started.
\begin{itemize}
\item \textbf{\texttt{ssh}} $\rightarrow$ If you are not using MobaXTerm, you will have to access grace using the \texttt{ssh} command. For the purpose of logging in for CMSC216, I recommend adding the \texttt{-y} flag in order to bypass the warning it will give you. E.g. \texttt{ssh -y [email protected]}
\item \textbf{\texttt{ls}} $\rightarrow$ The \texttt{ls} command lists all the files in your current directory. You can use the \texttt{-l} flag to get more detailed information. E.g. \texttt{ls}, \texttt{ls -l}
\item \textbf{\texttt{cd}} $\rightarrow$ The \texttt{cd} command changes the directory you're currently in, mainly to directories that you can see with \texttt{ls}. Typing \texttt{cd ..} will navigate one directory 'up' from your current directory, and \texttt{cd} without anything else will return you to your home directory. E.g. \texttt{cd 216public}
\item \textbf{\texttt{pwd}} $\rightarrow$ This command displays your current directory. Useful for finding out where exactly you are in the UNIX file hierarchy. E.g. \texttt{pwd}
\item \textbf{\texttt{cp}} $\rightarrow$ Copies files. If you use the \texttt{-r} flag, you're telling the command to recursively copy. If you want to use \texttt{cp} on directories, remember to use that flag.
\item \textbf{\texttt{rm}} $\rightarrow$ This command stands for 'remove'. It can be used to remove individual files, or can alternatively be used with the \texttt{-r} flag to recursively remove directories. E.g. \texttt{rm hello.c}, \texttt{rm -r project1} (project1 would be a folder).
\item \textbf{\texttt{.}}, \textbf{\texttt{..}}, \textbf{\texttt{\textasciitilde}}, and \textbf{\texttt{/}} $\rightarrow$ These abbreviations are pretty important. They can be used to navigate a filesystem in UNIX and build some clever commands. In order, they mean 'current directory', 'parent directory', 'user home folder', and 'root directory'. Below are some examples.
\begin{itemize}
\item \textbf{\texttt{cp *.c ../}} $\rightarrow$ Copies all files that end with \texttt{.c} to the parent directory.
\item \textbf{\texttt{cd /}} $\rightarrow$ Changes directories to the root directory.
\item \textbf{\texttt{cp -r \textasciitilde/216public/projects/project1 .}} $\rightarrow$ Recursively copies (this means that it copies directories as well as files) the project1 directory and everything in it into the current directory.
\end{itemize}
\end{itemize}
Lots of these UNIX commands are super useful once you get to know them, but it may be hard becoming acquainted with how they work from the outset. It's a far cry from the GUI you had in CMSC132, so here are a few tips.
\begin{itemize}
\item If you're just starting out and still need a graphical representation of the filesystem, I'd highly recommend setting up \textbf{MobaXTerm}. The program provides just a little more graphical representation than just a pure terminal, and allows you to navigate the Grace filesystem more freely. I like to think of it as training wheels as you get acquainted with Grace.
\item I'd highly recommend getting used to making folders, deleting folders, deleting files, and navigating up and down through the filesystem with rapid sequences of \texttt{ls} and \texttt{cd}. As with all things, practice makes perfect, and pretty soon you'll be a command line wizard.
\end{itemize}
\subsection{Machine}
A computer is composed of several parts, but a great way to think about it is a few main components connected by a \textbf{bus}. \newline
\begin{itemize}
\item \textbf{Memory} can just be thought of as a contiguous array of bytes. At the end of the day, this is the stuff that has to be written to/read from.
\item \textbf{I/O Devices} are connected to the CPU via a bus, as mentioned above. By performing read/write operations to the right adapter, the CPU is able to interface with different I/O devices.
\item \textbf{CPU} is the central processing unit of the computer. It handles computational operations (arithmetic, logic, etc.) and interfaces with the memory and I/O devices via the bus. The CPU is also responsible for performing the \textbf{fetch-execute cycle}.
\end{itemize}
A bus is like one main connector that's responsible for making sure the CPU, memory, and I/O devices are all able to interface with each other.\newline\newline
Note that in this course, we won't be going too in-depth into hardware (that's more Computer Engineering), but it's great background knowledge to have as you approach this class, which is why I have included it here.
\section{Week 2}
\subsection{The Math Library}
We won't be using the math library much in C, but for the times that we do, just remember one simple flag that we add to the \texttt{gcc} command. As an example, if you write some code that uses the math library like below, you may find that it fails at the linking step with a plain \texttt{gcc} command, because the code for functions like \texttt{sqrt} lives in a separate library.
{\centering
\begin{lstlisting}[language=C]
#include <stdio.h>
#include <math.h>
int main() {
double value;
printf("Enter a number: ");
scanf("%lf", &value); /* Notice the use of %lf */
printf("sqrt: %f\n", sqrt(value));
printf("power of 2: %f\n", pow(value, 2));
printf("sin: %f\n", sin(value));
return 0;
}
\end{lstlisting}
}
The \texttt{-lm} flag tells \texttt{gcc} to link your program against the math library. In other words, if you want to compile the above file and have it work properly (let's assume it's called \texttt{math\_example.c}), then you'll want to compile it using the following command. It is safest to list libraries after the source files that use them, since the linker may resolve names in command-line order.\newline
\texttt{gcc math\_example.c -lm}
\subsection{Using Emacs}
Most of the instruction for this course will be done in \texttt{emacs}, a highly versatile text editor that you can use in GUI form or from the command line. It's always an option to use other text editors in this class, but I would recommend using \texttt{emacs}, as it's what all the in-class demos are in. There is a way to set up IDEs like Visual Studio Code to function with Grace, but I won't cover it here. I believe that although graphical IDEs have their advantages, you'll get plenty of experience with them in CMSC330 and CMSC4XX, so for now, develop your skills in a command line editor like \texttt{emacs} or \texttt{vi}.\newline
For your benefit, here are some basic commands in \texttt{emacs} that I've found useful over the time that I took 216.\newline\newline
\textbf{Note:} When I indicate to type \texttt{M}, that means you need to press the 'meta' key. On most machines, the 'meta' key is the 'alt/option'. When I indicate to type \texttt{C}, I mean the 'control' key. The reason I'm using this notation is because it's the same notation that online guides use to describe \texttt{emacs} shortcuts.
\begin{itemize}
\item \textbf{\texttt{C-x C-s}} $\rightarrow$ Saves the file you're working on. Remember to do this frequently on Grace, as you can't guarantee that your connection to Grace will stay intact.
\item \textbf{\texttt{C-x C-c}} $\rightarrow$ Exits \texttt{emacs}. If you have unsaved changes, it will prompt you to save first.
\item \textbf{\texttt{C-x u}} $\rightarrow$ Undoes the most recent change that you made.
\item \textbf{\texttt{C-s}} $\rightarrow$ Search forwards (this will search for text that'll be ahead of where your cursor is now.)
\item \textbf{\texttt{C-r}} $\rightarrow$ Search backwards (this will search for text that'll be behind where your cursor is now.)
\item \textbf{\texttt{C-l}} $\rightarrow$ This command will center the window around your cursor. A great technique when you have large C files that you're editing.
\item \textbf{\texttt{M-x column-number-mode}} $\rightarrow$ Shows column numbers. Useful if you want to check if you're above the 80 character limit.
\end{itemize}
\subsection{Debugging}
There are three main debugging tools that we use in 216: Valgrind, GDB, and splint. For now, we won't focus too much on Valgrind, as it's more oriented towards helping programmers get rid of memory leaks and other memory-related issues. We will focus on GDB and Splint.
\subsubsection{GDB}
GDB is the C equivalent of the Eclipse Debugger. It lets you do everything that the Eclipse Debugger allowed you to do in CMSC131 and CMSC132. The only real drawback here is that it's all done from the command line, so the graphical part of the interface is a little lacking. However, it's an essential tool that I'd highly recommend using to figure out errors in your code.\newline
Online references will tell you that there are a lot of commands that you need to know to effectively use GDB, but here are some of the ones that I've found useful.
\begin{itemize}
\item \textbf{\texttt{q}} $\rightarrow$ exits gdb. Useful.
\item \textbf{\texttt{start}} $\rightarrow$ starts running your code with a temporary breakpoint at the first line of main(). This allows you to set more breakpoints before the code actually starts executing.
\item \textbf{\texttt{l}} $\rightarrow$ lists the code that you have.
\item \textbf{\texttt{b}} $\rightarrow$ typing \texttt{b} with a number next to it sets a breakpoint at that line. E.g. \texttt{b 3}
\item \textbf{\texttt{n}} $\rightarrow$ the equivalent of step over in the Eclipse debugger
\item \textbf{\texttt{s}} $\rightarrow$ the equivalent of step into in the Eclipse debugger
\item \textbf{\texttt{c}} $\rightarrow$ will continue running your code until the next breakpoint
\item \textbf{\texttt{p}} $\rightarrow$ will print the value of an expression or a variable. E.g. \texttt{p valid\_character('x')}.
\end{itemize}
In order to start GDB, you'll first need to compile your C code into an executable (\texttt{a.out} by default). I would also recommend that you compile your code with the \texttt{-ggdb} flag, which includes the debugging information that GDB uses to show you source lines and variable names. To run GDB on your newly compiled program, just type \texttt{gdb a.out}.\newline
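To try the commands above, it helps to have a small program to step through. The following is a minimal sketch rather than course-provided code: \texttt{valid\_character} borrows its name from the \texttt{p} example above, but its behavior here (accepting only lowercase letters) is an assumption made for illustration, and \texttt{count\_valid} is a made-up helper.
{\centering
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: returns 1 if c is a lowercase letter, 0 otherwise. */
int valid_character(char c) {
   return c >= 'a' && c <= 'z';
}

/* Hypothetical helper: counts the characters in str that pass
   valid_character. */
int count_valid(const char *str) {
   int count = 0, i;

   for (i = 0; str[i] != '\0'; i++) {
      if (valid_character(str[i])) {
         count++;
      }
   }
   return count;
}

int main(void) {
   printf("%d\n", count_valid("cmsc216"));
   return 0;
}
\end{lstlisting}
}
One possible session: compile with \texttt{-ggdb}, run \texttt{gdb a.out}, type \texttt{start}, set a breakpoint with \texttt{b valid\_character}, use \texttt{c} to reach it, and then \texttt{p c} to see which character is being checked.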
\section{Week 3}
This week we go over a lot of general C-specific programming concepts in discussion, and that material is heavier than what we usually do in discussion. In that sense, I'll try and go over the more basic stuff that I think will be highly useful as you work on your projects.
\subsection{Comma is an Operator}
The comma in C can act as an operator. When it does, it evaluates its left operand, throws that result away, and then evaluates its right operand, whose value becomes the value of the whole expression.
Remember, commas are \textbf{also} used as separators in C. Declaring multiple variables at once, like when you say \texttt{int i, j = 2}, uses the comma as a separator, and so does giving a function multiple arguments, like in \texttt{printf("\%d and \%d", i, j)}. When you consider the comma as an operator in C, it's always important to understand where it's an operator vs. where it's a separator.
Although we don't use the comma operator very often, one of the main reasons for understanding it is that it lets you initialize and update multiple variables in a single loop header. Take a look at the following example from my notes (from a previous offering of the CMSC216 course).
{\centering
\begin{lstlisting}[language=C]
/* Comma Operator Example by Nelson Padua-Perez */
for (j=0, k=10; j<=limit; j++, k+= 10) {
printf("j->%d, k->%d\n", j, k);
}
\end{lstlisting}
}
Notice how you initialize and increment multiple variables within a single for-loop.
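\newline
The loop above can be wrapped into a complete program to see what the comma operator actually produces. This is a minimal sketch; the function name \texttt{sum\_pairs} and the variables in \texttt{main} are made up for illustration.
{\centering
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: runs the loop from the text and adds up
   j + k on every iteration. */
int sum_pairs(int limit) {
   int j, k, total = 0;

   /* Both commas in the loop header are comma operators. */
   for (j = 0, k = 10; j <= limit; j++, k += 10) {
      total += j + k;
   }
   return total;
}

int main(void) {
   int a, b;

   /* a = 1 is evaluated first; then a + 5 is evaluated and becomes
      the value of the whole parenthesized expression. */
   b = (a = 1, a + 5);
   printf("a: %d, b: %d\n", a, b);
   printf("sum_pairs(2): %d\n", sum_pairs(2));
   return 0;
}
\end{lstlisting}
}
Here \texttt{b} ends up as 6, because the value of a comma expression is the value of its right operand.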
\subsection{Identifier Scope}
Scope exists in C in a similar way that it does in other languages. All you have to remember is that if you declare variables within code blocks, they won't be accessible outside those blocks. In that regard, this phenomenon is quite similar to how Java handles scopes.
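\newline
As a small illustration of block scope, consider the sketch below. The names \texttt{sum\_to}, \texttt{i}, and \texttt{x} are made up for this example.
{\centering
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: adds up the integers from 1 to n. */
int sum_to(int n) {
   int total = 0;

   {
      int i;   /* i is visible only inside this block */

      for (i = 1; i <= n; i++) {
         total += i;
      }
   }
   /* Using i here would be a compile error: it is out of scope. */
   return total;
}

int main(void) {
   int x = 1;

   {
      int x = 2;   /* a separate variable that hides the outer x */

      printf("inner x: %d\n", x);
   }
   printf("outer x: %d\n", x);
   printf("sum_to(4): %d\n", sum_to(4));
   return 0;
}
\end{lstlisting}
}
The inner \texttt{x} is a different variable from the outer one, so the outer \texttt{x} is still 1 after the block ends. If you compile with \texttt{-Wshadow}, \texttt{gcc} will warn about this kind of hiding, which is usually a sign that one of the variables should be renamed.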
\subsection{C Program Memory Organization}
As we delve deeper into systems-level programming, it's important to visualize how C actually manages the memory that your programs use. The diagram below is a simplified picture of how a running program's memory is laid out, so you're finally able to see 'under the hood' of your programs.\newline
You can see that the lowest address is represented by \texttt{0x0} and the highest address is represented by \texttt{0xffffffff}. These addresses are actual locations in memory, represented in hexadecimal format (hence ffffffff being the highest address in the representation).
\begin{center}
\tikzset{every picture/.style={line width=0.75pt}} %set default line width to 0.75pt
\begin{tikzpicture}[x=0.75pt,y=0.75pt,yscale=-1,xscale=1]
%uncomment if require: \path (0,300); %set diagram left start at 0, and has height of 300
%Shape: Rectangle [id:dp1982436807499255]
\draw (330,23) -- (400,23) -- (400,265) -- (330,265) -- cycle ;
%Shape: Rectangle [id:dp02798297576071851]
\draw [fill={rgb, 255:red, 184; green, 233; blue, 134 } ,fill opacity=1 ] (332.25,227) -- (397.75,227) -- (397.75,262.5) -- (332.25,262.5) -- cycle ;
%Shape: Rectangle [id:dp8690208046922276]
\draw [fill={rgb, 255:red, 74; green, 144; blue, 226 } ,fill opacity=1 ] (332.25,197.83) -- (397.75,197.83) -- (397.75,223.33) -- (332.25,223.33) -- cycle ;
%Shape: Rectangle [id:dp6529311675516911]
\draw [fill={rgb, 255:red, 245; green, 166; blue, 35 } ,fill opacity=1 ] (331.75,158.83) -- (397.25,158.83) -- (397.25,194.58) -- (331.75,194.58) -- cycle ;
%Up Arrow [id:dp9371264302671377]
\draw [fill={rgb, 255:red, 80; green, 227; blue, 194 } ,fill opacity=1 ] (354,148.03) -- (364.13,137) -- (374.25,148.03) -- (369.19,148.03) -- (369.19,164.58) -- (359.06,164.58) -- (359.06,148.03) -- cycle ;
%Shape: Rectangle [id:dp8515554275840291]
\draw [fill={rgb, 255:red, 208; green, 2; blue, 27 } ,fill opacity=1 ] (332.25,25.5) -- (397.75,25.5) -- (397.75,65.5) -- (332.25,65.5) -- cycle ;
%Down Arrow [id:dp38470823955499234]
\draw [fill={rgb, 255:red, 80; green, 227; blue, 194 } ,fill opacity=1 ] (355,72.1) -- (359.44,72.1) -- (359.44,58) -- (368.31,58) -- (368.31,72.1) -- (372.75,72.1) -- (363.88,81.5) -- cycle ;
% Text Node
\draw (365,211.25) node [align=left] {{\fontfamily{pcr}\selectfont Data}};
% Text Node
\draw (365,244.75) node [align=left] {{\fontfamily{pcr}\selectfont Text}};
% Text Node
\draw (364.5,176.71) node [align=left] {{\fontfamily{pcr}\selectfont Heap}};
% Text Node
\draw (365,45.5) node [align=left] {{\fontfamily{pcr}\selectfont Stack}};
% Text Node
\draw (299,24) node [align=left] {0xffffffff};
% Text Node
\draw (301,262) node [align=left] {0x0};
\end{tikzpicture}
\end{center}
Your program is allocated a certain block of memory, and within it are the following four components. Keep in mind that this, too, is an abstraction. You can further explore how programs are represented in memory in classes like CMSC411, but this is just about as far as we'll go in 216.
\begin{itemize}
\item \textbf{Text} is where the code for your program goes. It's really not much more complex than that.
\item \textbf{Data} is where global variables and variables that are static belong.
\item \textbf{Heap} is where dynamically allocated memory lives. In Java, this stuff was managed for you. In C, you will have to manage it yourself, allocating memory and effectively increasing the size of the heap if you need more space while your program is running, and deallocating (freeing) memory to decrease the size of the heap. More on this when we discuss dynamically allocated memory.
\item \textbf{Stack} is where local variables and function parameters live. It grows downwards, toward the heap, as functions are called; if it grows too far, the result is a stack overflow. If you think back to 'stack frames' from recursion in CMSC132, this is the exact same concept.
\end{itemize}
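The four regions can be observed, roughly, by printing where different kinds of variables live. The following is a minimal sketch with made-up names (\texttt{global\_counter}, \texttt{memory\_demo}); it uses pointers and \texttt{malloc}, which are covered properly later in the course, so don't worry about the details yet.
{\centering
\begin{lstlisting}[language=C]
#include <stdio.h>
#include <stdlib.h>

int global_counter = 5;   /* global variable: data */

/* Hypothetical helper: prints one address per kind of storage.
   Returns 1 on success, or 0 if the heap allocation failed. */
int memory_demo(void) {
   static int calls = 0;                 /* static variable: data */
   int local = 216;                      /* local variable: stack */
   int *dynamic = malloc(sizeof(int));   /* points into the heap */

   if (dynamic == NULL) {
      return 0;
   }
   *dynamic = local;
   calls++;
   printf("global at %p\n", (void *) &global_counter);
   printf("static at %p\n", (void *) &calls);
   printf("heap   at %p\n", (void *) dynamic);
   printf("local  at %p\n", (void *) &local);
   free(dynamic);
   return 1;
}

int main(void) {
   return memory_demo() ? 0 : 1;
}
\end{lstlisting}
}
The exact addresses will differ between runs and between machines, so treat the output as illustrative. On many systems the local variable's address is noticeably higher than the others, which matches the stack sitting near the top of the diagram.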
\subsection{Storage}
There are two ways variables are stored in C: automatic and static. This basically goes hand-in-hand with block scope and file scope, but the important takeaways are these. First of all, in the example below, once the function \texttt{foo} returns, the variable \texttt{n} is thrown away.
{\centering
\begin{lstlisting}[language=C]
int foo(int k) {
int n = 216;
return n;
}
\end{lstlisting}
}
In that regard, the variable \texttt{n} has automatic storage. If a variable has static storage, it basically exists throughout the duration of your program's running time. Such variables are initialized only once.\newline
An important note: \texttt{static} in C does not mean the same thing as it does in Java. Here are the two main things that I think are worth remembering about static variables in C:
\begin{itemize}
\item Static variables need not necessarily be initialized. If you don't bother initializing a static variable (you still have to declare it- this is not Python, language of the heathen) it will automatically initialize to zero.
\item Static variables retain their values between function invocations. In other words, they are not stored using automatic storage.
\end{itemize}
{\centering
\begin{lstlisting}[language=C]
/* example from Nelson Padua-Perez */
void compute_static(int x) {
static int value = 100; /* What would happen if we don't initialize it? */
printf("(static) x: %d, value: %d, sum: %d \n", x, value, value + x);
++value;
}
\end{lstlisting}
}
In the example above, if you called \texttt{compute\_static(1)} twice, then your output would be \texttt{(static) x: 1, value: 100, sum: 101} and \texttt{(static) x: 1, value: 101, sum: 102}, as \texttt{value} would retain its data between function calls.
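\newline
To see the two kinds of storage side by side, here is a minimal sketch with made-up function names (\texttt{count\_calls} and \texttt{count\_calls\_auto}). It also shows a static variable with no initializer starting out at zero.
{\centering
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: returns how many times it has been called. */
int count_calls(void) {
   static int calls;   /* no initializer: starts at zero */

   calls++;
   return calls;
}

/* Same code with automatic storage: n is recreated on every call,
   so this function always returns 1. */
int count_calls_auto(void) {
   int n = 0;

   n++;
   return n;
}

int main(void) {
   int i;

   for (i = 0; i < 3; i++) {
      printf("static: %d, automatic: %d\n",
             count_calls(), count_calls_auto());
   }
   return 0;
}
\end{lstlisting}
}
The static counter keeps climbing across calls, while the automatic one starts over each time.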
\subsection{Linkage}
Linkage is essentially the science behind having C code spread across multiple files and making sure it all compiles and works properly.\newline
We want to sometimes split code between multiple files for organizational purposes. Currently, the projects you're working on are small, but in order to make your programs versatile, modular and better organized, it's a great idea to split code between files.\newline
When you attempt this, there may be issues that follow. For example, you may encounter a situation where you want to name a function \texttt{print\_sum()} in two files. How would we deal with such a duplicate?\newline
Problems of this sort can be solved by adjusting the \textbf{linkage} of these functions.\newline
For actual code examples, please check the linkage-examples in the 216public directory. They're extremely thorough. My goal here is to provide a quick few tips on what I think are the most important parts.\newline
Essentially, there are three types of linkage that you should remember to guide you through writing code in multiple C files.
\begin{itemize}
\item \textbf{\texttt{None}} $\rightarrow$ No linkage. This is what ordinary local variables and function parameters have, and as you'd expect, it doesn't do anything special in regards to linkage. Think of it this way: A variable with no linkage belongs to a single function, and cannot be shared. In other words, there is \textit{only one copy per declaration}.
\item \textbf{\texttt{Internal}} $\rightarrow$ Internal linkage is what you get when you use the \textbf{\texttt{static}} keyword on a function or variable declared at file scope. All declarations of that identifier within a single file refer to the same thing, and the name is invisible to other files, so two files can each have their own \texttt{static} function named \texttt{print\_sum()} without a conflict. In other words, there is \textit{only one copy per file}.
\item \textbf{\texttt{External}} $\rightarrow$ External linkage is the default for functions and file-scope variables, and the \texttt{extern} keyword lets a file declare that it is using a name defined somewhere else. A name with external linkage can only refer to a single entity in your entire program. In other words, there is \textit{only one copy per program}.
\end{itemize}
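The three kinds of linkage can be seen together in one file. The following is a minimal single-file sketch, not a replacement for the multi-file examples in the 216public directory; the names \texttt{shared\_total} and \texttt{add\_to\_total} are made up for illustration.
{\centering
\begin{lstlisting}[language=C]
#include <stdio.h>

/* External linkage: one copy for the whole program. Another file
   could refer to it by declaring:  extern int shared_total;  */
int shared_total = 0;

/* Internal linkage: visible only in this file, so another file may
   define its own, unrelated print_sum without a conflict. */
static void print_sum(int a, int b) {
   printf("sum: %d\n", a + b);
}

/* External linkage, since nothing says otherwise. */
int add_to_total(int amount) {
   int doubled = amount * 2;   /* no linkage: local to this function */

   shared_total += doubled;
   return shared_total;
}

int main(void) {
   print_sum(2, 3);
   printf("total: %d\n", add_to_total(4));
   return 0;
}
\end{lstlisting}
}
If a second file also defined a non-static \texttt{add\_to\_total}, the linker would report a duplicate definition, while a second \texttt{static} \texttt{print\_sum} would be fine.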
\subsection{Enumerated Types}
Enumerated Types, or enums, in C are pretty useful, and quite comparable to their equivalents in Java. The best way to understand enums (in my opinion) is to think of examples. Some good ones are an enum for the days of the week (Monday, Tuesday, etc.), seasons (Summer, Spring, Fall, Winter), or even suits in a deck of cards (Spades, Clubs, Hearts, Diamonds). Below is an example of the latter.
{\centering
\begin{lstlisting}[language=C]
/* example from Nelson Padua-Perez */
#include <stdio.h>
int main() {
enum Suit {SPADES, HEARTS,DIAMONDS = 42, CLUBS};
enum Suit suit1, suit2;
suit1 = SPADES;
suit2 = CLUBS;
if (suit1 < suit2) printf("Spades are first.\n");
else printf("Clubs are first.\n");
printf("Spades = %d, Clubs = %d\n",suit1, suit2);
return 0;
}
\end{lstlisting}
}
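The functionality here is pretty basic, but one thing that I think is worth remembering (and quite nifty if you can use it well) is that enum constants are really just integers. By default they count up from 0, and a constant without an explicit value is one more than the constant before it, so in the example above \texttt{SPADES} is 0, \texttt{HEARTS} is 1, \texttt{DIAMONDS} is 42, and \texttt{CLUBS} is 43. This also means that, for example, with a month enum you can get away with adding January (0) and February (1) and end up with February (1).\newline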
Again, the code above is a great example of how you can leverage the integer-like characteristics of enums.
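\newline
The month arithmetic mentioned above can be sketched as follows. The \texttt{Month} enum (shortened to three months) and the \texttt{next\_month} helper are made up for illustration.
{\centering
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical month enum; only three months, for brevity. */
enum Month {JANUARY, FEBRUARY, MARCH};

/* Hypothetical helper: returns the month after m, wrapping from
   MARCH back around to JANUARY. */
enum Month next_month(enum Month m) {
   return (enum Month) ((m + 1) % 3);
}

int main(void) {
   enum Month m = JANUARY;

   printf("JANUARY + FEBRUARY = %d\n", JANUARY + FEBRUARY);
   m = next_month(m);
   printf("after JANUARY: %d\n", (int) m);
   return 0;
}
\end{lstlisting}
}
Both lines print 1, which is the integer behind \texttt{FEBRUARY}.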
\subsection{Implicit Type Conversion and Casting in C}
Switching between data types is pretty similar to how it was in Java, but here's a quick review of the stuff that matters. As you write your projects, you'll realize these things, but it's important to remember when it's a good idea to cast and when it isn't. Here are some general tips for you.\newline
\begin{itemize}
\item There are a few ways to represent numbers in C. For relatively small whole numbers, \texttt{short}s are the way to go. If you want to represent a number that's a little bigger, use an \texttt{int}. The difference between these two on the systems that we'll be working with is that \texttt{int}s are twice the size of \texttt{short}s (4 bytes vs. 2 bytes). If you want to represent a decimal number, you'll probably just want to use a \texttt{float}.
\item In my opinion, most projects that we'll deal with here can be accomplished perfectly well with just \texttt{int}s and \texttt{float}s.
\item We can also cast in C, and it works almost the same way as it did in Java. Just remember, in Java we had the concept of wrapper classes that allowed us to do fancy things with certain data types. In C, we don't enjoy that luxury, so we are restricted to just basic data type casting. Below is an example.
{\centering
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
#include <stdio.h>
int main() {
float x = 2.98;
int y = (int)x;
}
\end{lstlisting}
}
\item That works exactly as you think it does. It converts 2.98 to 2 as it would in Java. Remember, don't overthink it, and don't try to call any wrapper class methods that you remember from Java. As long as you keep that in mind, you should be good to convert between data types in C.
\end{itemize}
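To see an explicit cast next to one of C's automatic conversions, here's a minimal sketch. The helper names \texttt{to\_int} and \texttt{average} are made up for illustration. Casting a \texttt{float} to an \texttt{int} drops the fractional part, and casting one operand of a division to \texttt{float} makes C convert the other operand too, so the division keeps its decimals.
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: the cast drops the fractional part. */
int to_int(float x) {
    return (int) x;
}

/* Hypothetical helper: casting sum to float makes C convert
   count to float as well, so this is not integer division. */
float average(int sum, int count) {
    return (float) sum / count;
}

int main(void) {
    printf("%d\n", to_int(2.98f));    /* prints 2 */
    printf("%d\n", 7 / 2);            /* integer division: prints 3 */
    printf("%.1f\n", average(7, 2));  /* prints 3.5 */
    return 0;
}
\end{lstlisting}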
{\centering
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
#include <stdio.h>
int main() {
int x = 2000000000;
long result_long;
printf("Value of x: %d\n", x);
printf("Multiplying by 3 (with %%d format): %d\n", 2000000000 * 3);
printf("Multiplying by 3 (with %%ld format): %ld\n", 2000000000 * 3);
printf("Multiplying by 3L (with %%d format): %d\n", 2000000000 * 3L);
printf("Multiplying by 3L (with %%ld format): %ld\n", 2000000000 * 3L);
result_long = 2000000000 * 3; /* Does it solve the problem? */
printf("Storing result in long type variable: %ld\n", result_long);
return 0;
}
\end{lstlisting}
}
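The above example from Nelson isn't that basic, but I feel like it gives you good insight into where type conversion matters. The result of \texttt{2000000000 * 3} doesn't fit in a 4-byte \texttt{int}, so that multiplication overflows no matter which format specifier we print it with. Writing \texttt{3L} makes one operand a \texttt{long}, so C converts the other operand to \texttt{long} as well and does the multiplication in the bigger type (which only helps on systems where a \texttt{long} is larger than an \texttt{int}). The format specifier has to match the type of the value too: \texttt{\%d} for an \texttt{int}, \texttt{\%ld} for a \texttt{long}. Notice that storing \texttt{2000000000 * 3} in \texttt{result\_long} does not solve the problem, because the multiplication has already been done with \texttt{int}s before the assignment happens. Give that example a try to see how using multiple data types lets us handle larger values.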
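As a rough sketch of the fix, assuming \texttt{int} is 4 bytes as on the systems we work with, the idea is to widen one operand before the multiplication happens. The helper \texttt{triple} is a made-up name, and I've used \texttt{long long} instead of \texttt{long} only because it is guaranteed to be at least 8 bytes everywhere.
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: widen x BEFORE multiplying, so the
   arithmetic is done in long long rather than in int. */
long long triple(int x) {
    return (long long) x * 3;
}

int main(void) {
    printf("%lld\n", triple(2000000000)); /* prints 6000000000 */
    return 0;
}
\end{lstlisting}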
\section{Week 4}
This week we cover pointers, a few functions in C that you may find useful, and GDB in emacs. The main focus of these notes will be pointers, and chances are that you've seen a lot of this stuff in lecture as well. Make sure to take some time to try out the examples that we've got for you so you understand the basics of how pointers work, as they're a fundamental part of C.
\subsection{Pointers \& Memory Maps}
Let's go over pointers in C. You may have already covered this subject in lecture, but I'd like to point out some of the nuances that helped me understand pointers when I was taking 216. \newline
First of all, take note that pointers are just another type of variable. Just like you have \texttt{int}s and \texttt{char}s in C, which take up a certain amount of space and store a certain type of data, a \textbf{pointer} is a data type that stores an \textbf{address}. \newline
There are a bunch of ways to think of pointers, but I think the easiest way to understand them is to use memory maps. Think of them as a tool to help us better understand how pointers work; they are essentially just visual representations of memory in C. \newline
I think that pointers and memory maps go hand in hand in 216, so I'll include some examples (some of my own, plus the examples we go over in discussion) that I think will help you become proficient with both pointers and memory maps. \newline
As a side note, you can take a look at Nelson's sample memory map online if you need some extra guidance. (This should have been covered in discussion). \newline
\texttt{\href{http://www.cs.umd.edu/~nelson/classes/resources/MemoryMapExample.pdf}{http://www.cs.umd.edu/\textasciitilde{}nelson/classes/resources/MemoryMapExample.pdf}}
\subsection{Example - Integer \& Integer Pointer\newline}
{
\centering
\tikzset{every picture/.style={line width=0.75pt}} %set default line width to 0.75pt
\begin{tikzpicture}[x=0.75pt,y=0.75pt,yscale=-1,xscale=1]
%uncomment if require: \path (0,300); %set diagram left start at 0, and has height of 300
%Shape: Rectangle [id:dp058944761853589434]
\draw (369,30) -- (420,30) -- (420,59.14) -- (369,59.14) -- cycle ;
%Shape: Rectangle [id:dp8289049315294679]
\draw (370,103.33) -- (421,103.33) -- (421,132.48) -- (370,132.48) -- cycle ;
%Straight Lines [id:da15432785379816327]
\draw (395.5,117.9) -- (394.37,62) ;
\draw [shift={(394.33,60)}, rotate = 448.85] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
% Text Node
\draw (330,44) node [align=left] {{\fontfamily{pcr}\selectfont int a}};
% Text Node
\draw (394.5,44.57) node [align=left] {5};
% Text Node
\draw (325.67,116.33) node [align=left] {{\fontfamily{pcr}\selectfont int * b}};
\end{tikzpicture}
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
#include <stdio.h>
int main() {
    int a = 5;
    int * b = &a;
    return 0;
}
\end{lstlisting}
}
This is about as simple as we can get with pointers. There are a variety of types of pointers that exist (one for each data type in C), but just remember that they're essentially just variables that store addresses.\newline
In this example, we can see that \texttt{a} is an integer, and \texttt{b} is an integer pointer. Although I've drawn an arrow from the inside of \texttt{b}'s box to \texttt{a}'s box, don't let that confuse you. \newline
Think of it like this- \texttt{a} \textbf{contains} the integer value 5. \texttt{b} \textbf{contains} the address of \texttt{a}. By convention in C, we say that \texttt{b} points to \texttt{a}. We just show this by drawing an arrow that starts in \texttt{b}'s box and points to \texttt{a}.
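To make ``following the arrow'' concrete, here's a minimal sketch built on the same \texttt{a} and \texttt{b}. The helper \texttt{store} is a made-up name for illustration. Dereferencing \texttt{b} with \texttt{*b} reads the box that \texttt{b} points to, and writing through the pointer changes \texttt{a} itself.
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: writes v into whatever p points to. */
void store(int *p, int v) {
    *p = v;
}

int main(void) {
    int a = 5;
    int *b = &a;          /* b contains the address of a */

    printf("%d\n", *b);   /* follow the arrow: prints 5 */
    store(b, 7);          /* changes a through the pointer */
    printf("%d\n", a);    /* prints 7 */
    return 0;
}
\end{lstlisting}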
\subsection{Example - Multiple Pointer Types\newline}
{
\centering
\tikzset{every picture/.style={line width=0.75pt}} %set default line width to 0.75pt
\begin{tikzpicture}[x=0.75pt,y=0.75pt,yscale=-1,xscale=1]
%uncomment if require: \path (0,300); %set diagram left start at 0, and has height of 300
%Shape: Rectangle [id:dp058944761853589434]
\draw (353,164) -- (404,164) -- (404,193.14) -- (353,193.14) -- cycle ;
%Shape: Rectangle [id:dp8289049315294679]
\draw (354,237.33) -- (405,237.33) -- (405,266.48) -- (354,266.48) -- cycle ;
%Straight Lines [id:da15432785379816327]
\draw (379.5,251.9) -- (378.37,196) ;
\draw [shift={(378.33,194)}, rotate = 448.85] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
%Shape: Rectangle [id:dp0020066675833932957]
\draw (223.67,28.67) -- (274.67,28.67) -- (274.67,57.81) -- (223.67,57.81) -- cycle ;
%Shape: Rectangle [id:dp16429220233511355]
\draw (224.67,102) -- (275.67,102) -- (275.67,131.14) -- (224.67,131.14) -- cycle ;
%Straight Lines [id:da051837340435018975]
\draw (250.17,116.57) -- (249.04,60.67) ;
\draw [shift={(249,58.67)}, rotate = 448.85] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
%Shape: Rectangle [id:dp1385770633532739]
\draw (465.67,28.67) -- (516.67,28.67) -- (516.67,57.81) -- (465.67,57.81) -- cycle ;
%Shape: Rectangle [id:dp03270745379570461]
\draw (467.33,105.33) -- (518.33,105.33) -- (518.33,134.48) -- (467.33,134.48) -- cycle ;
%Straight Lines [id:da3069083126116453]
\draw (492.17,116.57) -- (491.04,60.67) ;
\draw [shift={(491,58.67)}, rotate = 448.85] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
% Text Node
\draw (284.67,177.33) node [align=left] {{\fontfamily{pcr}\selectfont double my\_double}};
% Text Node
\draw (378.5,178.57) node [align=left] {9.0};
% Text Node
\draw (272.33,251.67) node [align=left] {{\fontfamily{pcr}\selectfont double * double\_ptr}};
% Text Node
\draw (159.33,44) node [align=left] {{\fontfamily{pcr}\selectfont int my\_integer}};
% Text Node
\draw (249.17,43.24) node [align=left] {6};
% Text Node
\draw (151,113.67) node [align=left] {{\fontfamily{pcr}\selectfont int * integer\_ptr}};
% Text Node
\draw (399.33,44) node [align=left] {{\fontfamily{pcr}\selectfont char my\_char}};
% Text Node
\draw (491.17,43.24) node [align=left] {e};
% Text Node
\draw (395,118.33) node [align=left] {{\fontfamily{pcr}\selectfont char * char\_ptr}};
\end{tikzpicture}
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
#include <stdio.h>
int main() {
    int my_integer = 6;
    double my_double = 9.0;
    char my_char = 'e';
    int * integer_ptr = &my_integer;
    double * double_ptr = &my_double;
    char * char_ptr = &my_char;
    return 0;
}
\end{lstlisting}
}
Here's a similar case to up above, but I just wanted to demonstrate that there are different types of pointers. Now, keep in mind that all of these pointers essentially hold addresses, and it's not like the address of a double looks much different from the address of a character or the address of an integer.\newline
If you're wondering why C is so specific and asks you to define the type of pointer, the answer lies in how we will treat the data that the pointer points to. Sure, all pointers hold addresses, but an address alone doesn't say how many bytes live there or how to interpret them. What happens if we try to add the values that \texttt{double\_ptr} and \texttt{integer\_ptr} point to? If C only had one pointer type, the compiler would have no way of knowing whether to read each value as a \texttt{double} or as an \texttt{int}, and we wouldn't find out that we made a mistake until runtime. In that sense, C maintains different types of pointers to ensure type compatibility. The same address could just as easily end up in an integer pointer or a double pointer, but to make sure that you treat whatever is stored at that address in a type-compatible way, C keeps track of the type of what you're pointing to.
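Here's a minimal sketch of that idea, reusing the variable names from the diagram. The helper \texttt{add\_targets} is a made-up name for illustration. Because each pointer carries the type of its target, C knows to read an \texttt{int} through one and a \texttt{double} through the other, and \texttt{sizeof} reports how many bytes each dereference covers (the exact sizes depend on your system, so I haven't listed them).
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: the pointer types tell C to read an int
   through ip and a double through dp before adding them. */
double add_targets(const int *ip, const double *dp) {
    return *ip + *dp;
}

int main(void) {
    int my_integer = 6;
    double my_double = 9.0;
    int *integer_ptr = &my_integer;
    double *double_ptr = &my_double;

    /* The target type fixes how many bytes a dereference reads. */
    printf("int target: %lu bytes\n",
           (unsigned long) sizeof(*integer_ptr));
    printf("double target: %lu bytes\n",
           (unsigned long) sizeof(*double_ptr));
    /* prints sum = 15.0 */
    printf("sum = %.1f\n", add_targets(integer_ptr, double_ptr));
    return 0;
}
\end{lstlisting}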
\subsection{Example - Pointer To a String (Char Array) \newline}
{
\centering
\tikzset{every picture/.style={line width=0.75pt}} %set default line width to 0.75pt
\begin{tikzpicture}[x=0.75pt,y=0.75pt,yscale=-1,xscale=1]
%uncomment if require: \path (0,300); %set diagram left start at 0, and has height of 300
%Shape: Rectangle [id:dp16429220233511355]
\draw (245.67,207) -- (296.67,207) -- (296.67,236.14) -- (245.67,236.14) -- cycle ;
%Straight Lines [id:da051837340435018975]
\draw (271.17,221.57) -- (271.98,148.5) ;
\draw [shift={(272,146.5)}, rotate = 450.64] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
%Shape: Rectangle [id:dp8412783566407918]
\draw (243.67,117) -- (294.67,117) -- (294.67,146.14) -- (243.67,146.14) -- cycle ;
%Shape: Rectangle [id:dp7327342661859714]
\draw (294.67,117) -- (345.67,117) -- (345.67,146.14) -- (294.67,146.14) -- cycle ;
%Shape: Rectangle [id:dp03554725186896712]
\draw (345.67,117) -- (396.67,117) -- (396.67,146.14) -- (345.67,146.14) -- cycle ;
%Shape: Rectangle [id:dp5893017620753757]
\draw (396.67,117) -- (447.67,117) -- (447.67,146.14) -- (396.67,146.14) -- cycle ;
%Shape: Rectangle [id:dp8682717330515748]
\draw (447.67,117) -- (498.67,117) -- (498.67,146.14) -- (447.67,146.14) -- cycle ;
% Text Node
\draw (169,217.67) node [align=left] {{\fontfamily{pcr}\selectfont char my\_string[5]}};
% Text Node
\draw (269.17,131.57) node [align=left] {\textbackslash 0};
% Text Node
\draw (320.17,131.57) node [align=left] {\textbackslash 0};
% Text Node
\draw (371.17,131.57) node [align=left] {\textbackslash 0};
% Text Node
\draw (422.17,131.57) node [align=left] {\textbackslash 0};
% Text Node
\draw (473.17,131.57) node [align=left] {\textbackslash 0};
\end{tikzpicture}
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
#include <stdio.h>
int main() {
char my_string[5];
}
\end{lstlisting}
}
Finally, here's a look at how we would store a string. I picked a string because it's essentially an array of characters, so we get to see how both are represented in memory maps. \newline
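Here, don't let the notation confuse you. Although I've declared the string \texttt{my\_string} with array notation, in most expressions its name acts like a pointer to a character. In this case, \texttt{my\_string} refers to the first of 5 characters that C has set aside for us. Be careful, though: C does not clear a local array that has no initializer, so those 5 characters actually start out holding garbage values. I've taken the liberty of drawing the allocated blocks as null bytes (\texttt{\textbackslash 0}), which is what you would get by writing \texttt{char my\_string[5] = "";}.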
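Here's a minimal sketch of both points. The helper \texttt{all\_null} is a made-up name for illustration. With an initializer, all 5 bytes really are null bytes, and the array name can be used wherever a pointer to its first character is expected.
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: returns 1 if all n bytes are '\0'. */
int all_null(const char *s, size_t n) {
    size_t i;
    for (i = 0; i < n; i++) {
        if (s[i] != '\0') {
            return 0;
        }
    }
    return 1;
}

int main(void) {
    char my_string[5] = "";   /* initializer zero-fills all 5 bytes */
    char *first = my_string;  /* same address as &my_string[0] */

    printf("%d\n", first == &my_string[0]);                /* prints 1 */
    printf("%d\n", all_null(my_string, sizeof(my_string))); /* prints 1 */
    return 0;
}
\end{lstlisting}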
\subsection{Lab Examples}
I'll also go over the examples that we went over in lab, but a little less in-depth, as they're usually a bunch of concepts put together. We'll focus on what I think are the important portions of each example.
\subsection{Example from Lab - ptr\_review.c}
Here, we'll talk a little bit about \texttt{ptr\_review.c}. \newline (This file can be found at \texttt{\textasciitilde/216public/labs/Week4/lab1}.) \newline
This file goes over the basics of pointers with a few short functions, but I'd just like to go over a few of the questions posed in the file itself.
{
\centering
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
int main(void) { /* notice use of void in main */
float *p, *m; /* have garbage value */
float pressure; /* has garbage value */
int area = 10;
int a[3] = {777, 888}; /* missing value? */
p = &pressure; /* & returns address */
m = p; /* both m and p point to the same entity */
printf("Value1 %.2f\n", *m); /* are we ever getting a segmentation fault?*/
...
return 0;
}
\end{lstlisting}
}
\begin{itemize}
\item Using the keyword \texttt{void} in main essentially means that your program will be taking no arguments. That's the long and short of it.
\item When we define \texttt{p} and \texttt{m} as pointers and don't assign anything to them, they essentially contain garbage values. If you want a visual representation of that, just imagine two pointer variables with arrows pointing into the unknown. We don't know what they're pointing to, nor do we want to find out.
\item It's the same deal if we define a float without assigning it a value- it contains a garbage value.
\item When they set \texttt{m} equal to \texttt{p}, they're making it so both pointers are pointing to the same variable. If that confuses you, think of it the other way: pointers contain addresses, and it just so happens that after executing \texttt{m = p;}, both \texttt{m} and \texttt{p} contain the same address.
\end{itemize}
{
\centering
\tikzset{every picture/.style={line width=0.75pt}} %set default line width to 0.75pt
\begin{tikzpicture}[x=0.75pt,y=0.75pt,yscale=-1,xscale=1]
%uncomment if require: \path (0,300); %set diagram left start at 0, and has height of 300
%Shape: Rectangle [id:dp16429220233511355]
\draw (245.67,207) -- (296.67,207) -- (296.67,236.14) -- (245.67,236.14) -- cycle ;
%Straight Lines [id:da051837340435018975]
\draw (271.17,221.57) -- (318.07,132.27) ;
\draw [shift={(319,130.5)}, rotate = 477.71] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
%Shape: Rectangle [id:dp5176612453446056]
\draw (352.67,207) -- (403.67,207) -- (403.67,236.14) -- (352.67,236.14) -- cycle ;
%Straight Lines [id:da9913485463278986]
\draw (378.17,221.57) -- (320.09,132.18) ;
\draw [shift={(319,130.5)}, rotate = 416.99] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
%Shape: Rectangle [id:dp7874256794735744]
\draw (292.67,100) -- (343.67,100) -- (343.67,129.14) -- (292.67,129.14) -- cycle ;
% Text Node
\draw (232,218.67) node [align=left] {{\fontfamily{pcr}\selectfont m}};
% Text Node
\draw (337,219.67) node [align=left] {{\fontfamily{pcr}\selectfont p}};
% Text Node
\draw (250,112.67) node [align=left] {{\fontfamily{pcr}\selectfont pressure}};
\end{tikzpicture}
}
\begin{itemize}
\item Finally, when it asks if we are ever getting a segfault, the short answer is \textbf{not on that line}. By the time we reach the \texttt{printf}, \texttt{m} holds the address of \texttt{pressure}, so dereferencing it is safe; we just print whatever garbage value \texttt{pressure} happens to hold. Had we dereferenced \texttt{m} or \texttt{p} \textbf{before} assigning them an address, the answer would be \textbf{maybe}. In C, dereferencing a pointer that we have not yet initialized is considered \textbf{undefined behavior}. It could provide us with garbage data, or it could give us a segfault because we tried to access memory that the system doesn't let us touch. We don't really know what will happen in that case, which is why we call it undefined behavior. On grace, uninitialized variables often end up holding 0 or NULL, so you may not see this effect there, but don't rely on that: it is still undefined behavior in every C environment.
\end{itemize}
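Here's a minimal sketch that answers two of the questions in the comments. The helper \texttt{same\_target} is a made-up name, and unlike the lab file I've initialized \texttt{pressure} so that nothing here prints garbage. It shows that \texttt{m} and \texttt{p} really are two names for the same box, and that the ``missing value'' in \texttt{a} is filled in with 0.
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: do two pointers hold the same address? */
int same_target(const float *x, const float *y) {
    return x == y;
}

int main(void) {
    float pressure = 0.0f;   /* initialized here to avoid garbage */
    float *p = &pressure;
    float *m = p;            /* m and p contain the same address */
    int a[3] = {777, 888};   /* a[2] is filled in with 0 */

    *m = 14.7f;              /* write through m ... */
    printf("%.2f\n", *p);    /* ... read through p: prints 14.70 */
    printf("%d\n", same_target(m, p)); /* prints 1 */
    printf("%d\n", a[2]);    /* prints 0 */
    return 0;
}
\end{lstlisting}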
\subsection{Example from Lab - ptr\_add\_sub\_overview.c}
Here, we'll talk a little bit about \texttt{ptr\_add\_sub\_overview.c}. \newline (This file can be found at \texttt{\textasciitilde/216public/labs/Week4/lab1}.) \newline
This example is all about pointer arithmetic, and it relies on the fact that you understand that arrays are stored in contiguous memory. Let's think about the following example. If you had an array that was represented in C memory like this:\newline\newline
{
\centering
\tikzset{every picture/.style={line width=0.75pt}} %set default line width to 0.75pt
\begin{tikzpicture}[x=0.75pt,y=0.75pt,yscale=-1,xscale=1]
%uncomment if require: \path (0,300); %set diagram left start at 0, and has height of 300
%Shape: Rectangle [id:dp5176612453446056]
\draw (228.67,212) -- (279.67,212) -- (279.67,241.14) -- (228.67,241.14) -- cycle ;
%Straight Lines [id:da9913485463278986]
\draw (254.17,226.57) -- (258.9,132.5) ;
\draw [shift={(259,130.5)}, rotate = 452.88] [color={rgb, 255:red, 0; green, 0; blue, 0 } ][line width=0.75] (10.93,-3.29) .. controls (6.95,-1.4) and (3.31,-0.3) .. (0,0) .. controls (3.31,0.3) and (6.95,1.4) .. (10.93,3.29) ;
%Shape: Rectangle [id:dp7874256794735744]
\draw (234.67,99) -- (285.67,99) -- (285.67,128.14) -- (234.67,128.14) -- cycle ;
%Shape: Rectangle [id:dp04481985338500627]
\draw (285.67,99) -- (336.67,99) -- (336.67,128.14) -- (285.67,128.14) -- cycle ;
%Shape: Rectangle [id:dp7253813806088456]
\draw (336.67,99) -- (387.67,99) -- (387.67,128.14) -- (336.67,128.14) -- cycle ;
%Shape: Rectangle [id:dp7617124294428573]
\draw (387.67,99) -- (438.67,99) -- (438.67,128.14) -- (387.67,128.14) -- cycle ;
%Shape: Rectangle [id:dp12383450763178705]
\draw (438.67,99) -- (489.67,99) -- (489.67,128.14) -- (438.67,128.14) -- cycle ;
% Text Node
\draw (213,224.67) node [align=left] {{\fontfamily{pcr}\selectfont p}};
% Text Node
\draw (260.17,113.57) node [align=left] {1};
% Text Node
\draw (311.17,113.57) node [align=left] {6};
% Text Node
\draw (362.17,113.57) node [align=left] {6};
% Text Node
\draw (413.17,113.57) node [align=left] {1};
% Text Node
\draw (464.17,113.57) node [align=left] {2};
\end{tikzpicture}
}
In this case, since arrays are stored in contiguous memory, what we are claiming with pointer arithmetic is that if we dereference \texttt{p} now, we will get the number 1. If we \textbf{add} 1 to \texttt{p} (the actual pointer) and then dereference it, we will get the number 6. The file explores similar examples. Here are some highlights.
\begin{itemize}
\item Just like we discussed earlier, here's an application of simple pointer addition. As a reminder you can add numbers other than 1.
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
char name[MAX] = "The House is Blue";
char *p = name, *q;
int i;
/* You can add and subtract integer values from pointers. */
/* For example, if you add one to a pointer to a character */
/* array, the pointer will now be referring to the next */
/* character. You can add any integer value (not just one) */
/* Printing the string using pointer arithmetic */
while (*p != '\0') {
    printf("%c", *p);
    p = p + 1;
}
\end{lstlisting}
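\item You can also take advantage of the fact that arrays are stored in contiguous memory by subtracting pointers to find the `distance' between them, measured in elements rather than bytes. Note that this only works with pointers of the same type that point into the same array. In the code below, \texttt{q - p} is 4 and \texttt{p - q} is $-4$.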
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
/* You can tell how many elements are between two pointers */
/* by subtracting pointers */
p = name + 1;
q = &name[5];
printf("Elements #1: %ld\n", q - p);
printf("Elements #2: %ld\n", p - q);
\end{lstlisting}
\item Finally, you can leverage pointer arithmetic to help you index arrays as well. Here's an example of that below.
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
/* Indexing is a pointer operation */
printf("Indexing as pointer operation\n");
p = name;
for (i = 0; i < strlen(name); i++) {
printf("%c\n", p[i]);
}
\end{lstlisting}
\end{itemize}
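To tie the three highlights together, here's a minimal self-contained sketch that uses the array from the diagram above. The helper \texttt{sum} and the array name \texttt{data} are made up for illustration and don't come from the lab file. It walks the array by moving a pointer, and it checks that \texttt{p[i]} and \texttt{*(p + i)} name the same element.
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: sums n ints using pointer arithmetic. */
int sum(const int *p, int n) {
    const int *end = p + n;  /* one past the last element */
    int total = 0;
    while (p < end) {
        total += *p;         /* dereference the current element */
        p++;                 /* move to the next int */
    }
    return total;
}

int main(void) {
    int data[5] = {1, 6, 6, 1, 2};  /* the array from the diagram */
    int *p = data;

    printf("%d\n", *p);                     /* prints 1 */
    printf("%d\n", *(p + 1));               /* prints 6 */
    printf("%d\n", p[3] == *(p + 3));       /* prints 1 */
    printf("%ld\n", (long) (&data[4] - p)); /* prints 4 */
    printf("%d\n", sum(data, 5));           /* prints 16 */
    return 0;
}
\end{lstlisting}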
\subsection{Example from Lab - str\_review.c}
Here, we'll talk a little bit about \texttt{str\_review.c}. \newline (This file can be found at \texttt{\textasciitilde/216public/labs/Week4/lab1}.) \newline
This example is pretty light compared to the rest; it is just a review of how strings are stored in C. There are two main concepts you need to understand here:
\begin{itemize}
\item Strings are not given an actual data type in C. They are simply arrays of characters with a small caveat.
\item That being said, strings are always stored in a certain way. They are a character array terminated with a null byte. (No null byte at the end means you don't have a string- you have a regular old character array)
\end{itemize}
Take a look at my String example above for the memory map representation.
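To see why the null byte matters, here's a minimal sketch of how a string's length can be found. The helper \texttt{my\_strlen} is a made-up name; it is a simplified stand-in for the library's \texttt{strlen}, not its actual implementation. It just walks the characters until it hits the null byte, which is why a character array without one isn't a string.
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: counts characters up to (but not
   including) the null byte. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

int main(void) {
    char word[6] = "hello";  /* 5 letters + 1 null byte */

    printf("%lu\n", (unsigned long) my_strlen(word)); /* prints 5 */
    word[2] = '\0';          /* an early null byte ends the string */
    printf("%s\n", word);    /* prints he */
    return 0;
}
\end{lstlisting}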
\subsection{Using getchar() and putchar()}
The two functions \texttt{getchar} and \texttt{putchar} are pretty curious, in that we have much more functional replacements for them- \texttt{scanf} and \texttt{printf}, respectively. However, learning these is a cool way to prep yourself for how basic I/O in assembly works, so I think that it's worth it to at least gloss over these for now. \newline
Let's look over the code provided for us in discussion and touch on the main points.
\begin{lstlisting}[language=C]
// example from Nelson Padua-Perez
#include <stdio.h>
#define MAX_LEN 80
int main() {
char value[MAX_LEN + 1];
int letter; /* Why integer? */
printf("Enter a letter: ");
scanf("%1s", value);
printf("Value entered: \"%s\"\n", value);
getchar(); /* getchar() reads a single character; why we need it? */
printf("Enter a letter: ");
letter = getchar();
printf("Letter entered: ");
putchar(letter); /* putchar() prints a single character */
printf("\n");
/* try ungetc to put characters back */
return 0;
}
\end{lstlisting}
\begin{itemize}
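\item First of all, both \texttt{getchar} and \texttt{putchar} deal with integers, despite the fact that they are meant to take in/print characters. Don't let this confuse you: a character is stored as its ASCII value, which is just a small integer. The reason \texttt{getchar} returns an \texttt{int} rather than a \texttt{char} is that it also needs to be able to return the special value \texttt{EOF} when there's no more input, and \texttt{EOF} has to be distinguishable from every real character.
\item The bare call to \texttt{getchar()} right after \texttt{scanf} is there to throw away the newline that's still sitting in the input from when you pressed Enter. Without it, the second \texttt{getchar()} would read that newline instead of your letter.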
\item Both of these get and print a single character, and in my opinion, there's no real reason to need them except in very special cases, but this is how I/O will be conducted in Assembly, so I think it's worth taking a look at this now.
\item Your main takeaway from this should be that \texttt{getchar} and \texttt{putchar} are functions that we can use to do I/O in C, and even though they're a little more crude than we'd like for most applications, they still exist, and are helpful tools when we're trying to understand Assembly.
\end{itemize}
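Here's a minimal sketch of the classic loop these functions are used for. The helper \texttt{copy\_chars} is a made-up name, and I've written it with \texttt{getc} and \texttt{putc} so it can work on any stream; \texttt{getchar()} behaves like \texttt{getc(stdin)} and \texttt{putchar(c)} behaves like \texttt{putc(c, stdout)}. Notice that the character is kept in an \texttt{int} so that it can be compared against \texttt{EOF}.
\begin{lstlisting}[language=C]
#include <stdio.h>

/* Hypothetical helper: copies characters from in to out until
   EOF and returns how many were copied. */
long copy_chars(FILE *in, FILE *out) {
    int c;            /* int, not char, so EOF can be told apart */
    long count = 0;
    while ((c = getc(in)) != EOF) {
        putc(c, out);
        count++;
    }
    return count;
}

int main(void) {
    /* Echo standard input to standard output. */
    long n = copy_chars(stdin, stdout);
    fprintf(stderr, "copied %ld characters\n", n);
    return 0;
}
\end{lstlisting}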
\section{Week 5}
\subsection{Grep - A 'CTRL-F' From the Command Line}
When working with the command line, we have the unique opportunity to see older versions of computer tools that we are accustomed to today. In modern environments, if you want to find something on a webpage, in a textbook, or even in your \texttt{.java} file in Eclipse, the first thing that probably comes to your head is the command `\texttt{CTRL + F}'. In a command line environment, the command that preceded this functionality is known as \texttt{grep}.
\subsubsection{Why is it called that?}
The name of the command itself has an interesting origin. The most basic text editor on UNIX systems is regarded by many as \texttt{ed}, and in that text editor, one was able to globally search the file for a regular expression (which you'll learn more about in CMSC330), then print what was found using the command `\texttt{g/re/p}'. This gave rise to the name ``grep''.
\subsubsection{Why it's useful}
As you'd imagine, grep can be used to simply search the files we have for keywords. Let's take a look at some examples. You can follow along if you head over to \texttt{216public/labs/Week5/lab 1/grep\_example}.
Let's take a look at the text files that we will be searching through, as examples.
\begin{lstlisting}[language=C]
The college is in
the east coast.
\end{lstlisting}
\begin{center}
\textbf{data.txt}
\end{center}
\begin{lstlisting}[language=C]
The project is about hashing,
files, structures,
pointers
and dynamic memory allocation (and more pointers).
\end{lstlisting}
\begin{center}
\textbf{summary.txt}
\end{center}
These two files are in the same directory, and for the purpose of the examples I'll go over, let's assume that we're currently in the directory that contains both these files.\newline
\texttt{grep} works like this: you provide it a key phrase and a file location, and it'll take care of the rest. If you want more technical information on how grep commands should be structured, I encourage you to take a look at \texttt{man grep}. \newline
If you execute the command \texttt{grep college data.txt}, then grep will print out the line that it found your keyword on (the output for that command will be \texttt{The college is in}).\newline
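Since this is a C course, here's a rough sketch of what that search boils down to. This is not how the real \texttt{grep} is implemented (the real one matches regular expressions, while this only looks for a plain substring), and the helper \texttt{print\_matches} is a made-up name. The sketch fills a temporary file with the contents of \texttt{data.txt} so that it runs on its own.
\begin{lstlisting}[language=C]
#include <stdio.h>
#include <string.h>

#define LINE_MAX_LEN 1024  /* assumed maximum line length */

/* Hypothetical helper: prints every line of in that contains
   key and returns the number of matching lines. */
int print_matches(FILE *in, const char *key, FILE *out) {
    char line[LINE_MAX_LEN];
    int matches = 0;
    while (fgets(line, sizeof(line), in) != NULL) {
        if (strstr(line, key) != NULL) {
            fputs(line, out);
            matches++;
        }
    }
    return matches;
}

int main(void) {
    FILE *data = tmpfile();  /* stand-in for data.txt */
    if (data == NULL) {
        return 1;
    }
    fputs("The college is in\nthe east coast.\n", data);
    rewind(data);
    /* prints: The college is in */
    print_matches(data, "college", stdout);
    fclose(data);
    return 0;
}
\end{lstlisting}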
Where \texttt{grep} really shines is when you want to mix in some of the cool UNIX keywords we've been learning. As a quick example, let's say you wanted to search for all the occurrences of `is' in all the text files you had in the current directory. To do that, you'd simply execute the following command.\newline