of Assembly][Randall Hyde]
Art of Assembly Language: Chapter Twenty-Five
- Chapter 25 - Optimizing Your Programs
- 25.1 - When to Optimize, When Not to Optimize
- 25.2 - How Do You Find the Slow Code in
- 25.3 - Is Optimization Necessary?
- 25.4 - The Three Types of Optimization
- 25.5 - Improving the Implementation
of an Algorithm
Copyright 1996 by Randall Hyde
All rights reserved.
Duplication other than for immediate display through a browser is prohibited
by U.S. Copyright Law.
This material is provided on-line as a beta-test of this text. It is for
the personal use of the reader only. If you are interested in using this
material as part of a course, please contact
Supporting software and other materials are available via anonymous ftp
from ftp.cs.ucr.edu. See the "/pub/pc/ibmpcdir" directory for
details. You may also download the material from "Randall Hyde's Assembly
Language Page" at URL:
This document does not contain the laboratory exercises, programming assignments,
exercises, or chapter summary. These portions were omitted for several reasons:
either they wouldn't format properly, they contained hyperlinks that were
too much work to resolve, they were under constant revision, or they were
not included for security reasons. Such omission should have very little
impact on the reader interested in learning this material or evaluating
This document was prepared using Harlequin's Web Maker 2.2 and Quadralay's
Webworks Publisher. Since HTML does not support the rich formatting options
available in Framemaker, this document is only an approximation of the actual
chapter from the textbook.
If you are absolutely dying to get your hands on a version other than HTML,
you might consider having the UCR Printing a Reprographics Department run
you off a copy on their Xerox machines. For details, please read the following
EMAIL message I received from the Printing and Reprographics Department:
Hello Again Professor Hyde,
We are currently working on ways to publish this text in a form other than
HTML (e.g., Postscript, PDF, Frameviewer, hard copy, etc.). This, however,
is a low-priority project. Please do not contact Randall Hyde concerning
this effort. When something happens, an announcement will appear on "Randall
Hyde's Assembly Language Page." Please visit this WEB site at http://webster.ucr.edu
for the latest scoop.
Dallas gave me permission to take orders for the Computer Science 13 Manuals.
We would need to take charge card orders. The only cards we take are: Master
Card, Visa, and Discover. They would need to send the name, numbers, expiration
date, type of card, and authorization to charge $95.00 for the manual and
shipping, also we should have their phone number in case the company has
any trouble delivery. They can use my e-mail address for the orders and
I will process them as soon as possible. I would assume that two weeks would
be sufficient for printing, packages and delivery time.
I am open to suggestions if you can think of any to make this as easy as
Thank You for your business,
Kathy Chapman, Assistant
Printing and Reprographics
University of California
Art of Assembly Bug Report Submissions
Did you find an error in The Art of Assembly Language Programming?
You can let me know by using the form below to report the error to me so
that I can correct the error for the next beta version. Thank you.
The Submission Form
Please provide your name and e-mail address so I can contact you if
I have any questions regarding your submission.
Chapter 25 Optimizing Your Programs
Since program optimization is generally one of the last steps in software
development, it is only fitting to discuss program optimization in the last
chapter of this text. Scanning through other texts that cover this subject,
you will find a wide variety of opinions on this subject. Some texts and
articles ignore instruction sets altogether and concentrate on finding a
better algorithm. Other documents assume you've already found the best algorithm
and discuss ways to select the "best" sequence of instructions
to accomplish the job. Others consider the CPU architecture and describe
how to "count cycles" and pair instructions (especially on superscalar
processors or processes with pipelines) to produce faster running code.
Others, still, consider the system architecture, not just the CPU architecture,
when attempting to decide how to optimize your program. Some authors spend
a lot of time explaining that their method is the "one true way"
to faster programs. Others still get off on a software engineering tangent
and start talking about how time spent optmizing a program isn't worthwhile
for a variety of reasons. Well, this chapter is not going to present the
"one true way," nor is it going to spend a lot of time bickering
about certain optimization techniques. It will simply present you with some
examples, options, and suggestions. Since you're on your own after this
chapter, it's time for you to start making some of your own decisions. Hopefully,
this chapter can provide suitable information so you can make correct decisions.
25.1 When to Optimize, When Not to Optimize
The optimization process is not cheap. If you develop a program and
then determine that it is too slow, you may have to redesign and rewrite
major portions of that program to get acceptable performance. Based on this
point alone, the world often divides itself into two camps - those who optimize
early and those who optimize
late. Both groups have good arguements; both groups have some bad arguements.
Let's take a look at both sides of this arguement.
The "optimize late" (OL) crowd uses the 90/10
arguement: 90% of a program's execution time is spent in 10% of the code.
If you try to optimize every piece of code you write (that is, optimize
the code before you know that it needs to be optimized), 90% of your effort
will go to waste. On the other hand, if you write the code in a normal fashion
first and then go in an optimize, you can improve your program's performance
with less work. After all, if you completely removed the 90% portion of
your program, your code would only run about 10% faster. On the other hand,
if you completely remove that 10% portion, your program will run about 10
times faster. The math is obviously in favor of attacking the 10%. The OL
crowd claims that you should write your code with only the normal attention
to performance (i.e., given a choice between an O(n2) and an O(n lg n) algorithm,
you should choose the latter). Once the program is working correctly you
can go back and concentrate your efforts on that 10% of the code that takes
all the time.
The OL arguements are persuasive. Optimization is a laborious and difficult
process. More often that not there is no clear-cut way to speed up a section
of code. The only way to determine which of several different options is
better is to actually code them all up and compare them. Attempting to do
this on the entire program is impractical. However, if you can find that
10% of the code and optimize that, you've reduced your workload by 90%,
very inviting indeed. Another good arguement the OL group uses is that few
programmers are capable of anticipating where the time will be spent in
a program. Therefore, the only real way to determine where a program spends
its time is to instrument it and measure which functions consume the most
time. Obviously, you must have a working program before you can do this.
Once again, they argue that any time spent optimizing the code beforehand
is bound to be wasted since you will probably wind up optimizing that 90%
that doesn't need it.
There are, however, some very good counter arguments to the above. First,
when most OL types start talking about the 90/10 rule, there is this implicit
suggestion that this 10% of the code appears as one big chunk in the middle
of the program. A good programmer, like a good surgeon, can locate this
malignant mass, cut it out, and replace with with something much faster,
thus boosting the speed of your program with only a little effort. Unfortunately,
this is not often the case in the real world. In real programs, that 10%
of the code that takes up 90% of the execution time is often spread all
over your program. You'll get 1% here, 0.5% over there, a "gigantic"
2.5% in one function, and so on. Worse still, optimizing 1% of the code
within one function often requires that you modify some of the other code
as well. For example, rewriting a function (the 1%) to speed it up quite
a bit may require changing the way you pass parameters to that function.
This may require rewriting several sections of code outside that slow 10%.
So often you wind up rewriting much more than 10% of the code in order to
speed up that 10% that takes 90% of the time.
Another problem with the
90/10 rule is that it works on percentages, and the percentages change during
optimization. For example, suppose you located a single function that was
consuming 90% of the execution time. Let's suppose you're Mr. Super Programmer
and you managed to speed this routine up by a factor of two. Your program
will now take about 55% of the time to run before it was optimized. If you
triple the speed of this routine, your program takes a total of 40% of the
original time to execution. If you are really great and you manage to get
that function running nine times faster, your program now runs in 20% of
the original time, i.e., five times faster.
Suppose you could get that function running nine times faster. Notice that
the 90/10 rule no longer applies to your program. 50% of the execution time
is spent in 10% of your code, 50% is spent in the other 90% of your code.
And if you've managed to speed up that one function by 900%, it is very
unlikely you're going to squeeze much more out of it (unless it was really
bad to begin with). Is it worthwhile messing around with that other 90%
of your code? You bet it is. After all, you can improve the performance
of your program by 25% if you double the speed of that other code. Note,
however, that you only get a 25% performance boost after you optimized the
10% as best you could. Had you optimized the 90% of your program first,
you would only have gotten a 5% performance improvement; hardly something
you'd write home about. Nonetheless, you can see some situations where the
90/10 rule obviously doesn't apply and you can see some cases where optimizing
that 90% can produce a good boost in performance. The OL group will smile
and say "see, that's the benefit of optimizing late, you can optimize
in stages and get just the right amount of optimization you need."
The optimize early (OE) group uses the flaw in percentage arithmetic to
point out that you will probably wind up optimizing a large portion of your
program anyway. So why not work all this into your design in the first place?
A big problem with the OL strategy is that you often wind up designing and
writing the program twice - once just to get it functional, the second time
to make it practical. After all, if you're going to have to rewrite that
90% anyway, why not write it fast in the first place? The OE people also
point out that although programmers are notoriously bad at determining where
a program spends most of its time, there are some obvious places where they
know there will be performance problems. Why wait to discover the obvious?
Why not handle such problem areas early on so there is less time spent measuring
and optimizing that code?
Like so many other arguements in Software Engineering, the two camps become
quite polarized and swear by a totally pure approach in either direction
(either all OE or all OL). Like so many other arguements in Computer Science,
the truth actually lies somewhere between these two extremes. Any project
where the programmer set out to design the perfect program without worry
about performance until the end is doomed. Most programmers in this scenario
write terribly slow code. Why? Because it's easier to do so and they can
always "solve the performance problem during the optimization phase."
As a result, the 90% portion of the program is often so slow that even if
the time of the other 10% were reduced to zero, the program would still
be way too slow. On the other hand, the OE crowd gets so caught up in writing
the best possible code that they miss deadlines and the product may never
There is one undeniable fact that favors the OL arguement - optimized code
is difficult to understand and maintain. Furthermore, it often contains
bugs that are not present in the unoptimized code. Since incorrect code
is unacceptable, even if it does run faster, one very good arguement against
optimizing early is the fact that testing, debugging, and quality assurance
represent a large portion of the program development cycle. Optimizing early
may create so many additional program errors that you lose any time saved
by not having to optimize the program later in the development cycle.
The correct time to optimize a program is, well, at the correct time. Unfortunately,
the "correct time" varies with the program. However, the first
step is to develop program performance requirements along with the other
program specifications. The system analyst should develop target response
times for all user interactions and computations. During development and
testing, programmers have a target to shoot for, so they can't get lazy
and wait for the optimization phase before writing code that performs reasonably
well. On the other hand, they also have a target to shoot for and once the
code is running fast enough, they don't have to waste time, or make their
code less maintainable; they can go on and work on the rest of the program.
Of course, the system analyst could misjudge performance requirements, but
this won't happen often with a good system design.
Another consideration is when to perform what. There are several types of
optimizations you can perform. For example, you can rearrange instructions
to avoid hazards to double the speed of a piece of code. Or you could choose
a different algorithm that could run twice as fast. One big problem with
optimization is that it is not a single process and many types of optimizations
are best done later rather than earlier, or vice versa. For example, choosing
a good algorithm is something you should do early on. If you decide to use
a better algorithm after implementing a poor one, most of the work on the
code implementing the old algorithm is lost. Likewise, instruction scheduling
is one of the last optimizations you should do. Any changes to the code
after rearranging instructions for performance may force you to spend time
rearranging them again later. Clearly, the lower level the optimization
(i.e., relying upon CPU or system parameters), the later the optimization
should be. Conversely, the higher level the optimization (e.g., choice of
algorithm), the sooner should be the optimization. In all cases, though,
you should have target performance values in mind while developing code.
25.2 How Do You Find the Slow Code in Your
Although there are problems with the 90/10 rule, the concept behind
it is basically solid - programs tend to spend a large amount of their time
executing only a small percentage of the code. Clearly, you should optimize
the slowest portion of your code first. The only problem is how does one
find the slowest code in a program?
There are four common techniques programmers use to find the "hot spots" (the places where programs spend most of their
time). The first is by trial and error. The second is to optimize everything.
The third is to analyze the program. The fourth is to use a profiler or other software monitoring tool to measure the
performance of various parts of a program. After locating a hot spot, the
programmer can attempt to analyze that section of the program.
The trial and error technique is, unfortunately, the most common strategy.
A programmer will speed up various parts of the program by making educated
guesses about where it is spending most of its time. If the programmer guesses
right, the program will run much faster after optimization. Experienced
programmers often use this technique successfully to quickly locate and
optimize a program. When the programmer guesses correctly, this technique
minimizes the amount of time spent looking for hot spots in a program. Unfortunately,
most programmers make fairly poor guesses and wind up optimizing the wrong
sections of code. Such effort often goes to waste since optimizing the wrong
10% will not improve performance significantly. One of the prime reasons
this technique fails so often is that it is often the first choice of inexperienced
programmers who cannot easily recognize slow code. Unfotunately, they are
probably unaware of other techniques, so rather than try a structured approach,
they start making (often) uneducated guesses.
Another way to locate and optimize the slow portion of a program is to optimize
everything. Obviously, this technique does not work well for large programs,
but for short sections of code it works reasonably well. Later, this text
will provide a short example of an optimization problem and will use this
technique to optimize the program. Of course, for large programs or routines
this may not be a cost effective approach. However, where appropriate it
can save you time while optimizing your program (or at least a portion of
your program) since you will not need to carefully analyze and measure the
performance of your code. By optimizing everything, you are sure to optimize
the slow code.
The analysis method is the most difficult of the
four. With this method, you study your code and determine where it will
spend most of its time based on the data you expect it to process. In theory,
this is the best technique. In practice, human beings generally demonstrate
a distaste for such analysis work. As such, the analysis is often incorrect
or takes too long to complete. Furthermore, few programmers have much experience
studying their code to determine where it is spending most of its time,
so they are often quite poor at locating hot spots by studying their listings
when the need arises.
Despite the problems with program analysis, this is the first technique
you should always use when attempting to optimize a program. Almost all
programs spend most of their time executing the body of a loop or recursive
function calls. Therefore, you should try to locate all recursive function
calls and loop bodies (especially nested loops) in your program. Chances
are very good that a program will be spending most of its time in one of
these two areas of your program. Such spots are the first to consider when
optimizing your programs.
Although the analytical method provides a good way to locate the slow code
in a program, analyzing program is a slow, tedious, and boring process.
It is very easy to completely miss the most time consuming portion of a
program, especially in the presence of indirectly recursive function calls.
Even locating time consuming nested loops is often difficult. For example,
you might not realize, when looking at a loop within a procedure, that it
is a nested loop by virtue of the fact that the calling code executes a
loop when calling the procedure. In theory, the analytical method should
always work. In practice, it is only marginally successful given that fallible
humans are doing the analysis. Nevertheless, some hot spots are easy to
find through program analysis, so your first step when optimizing a program
should be analysis.
Since programmers are notoriously bad at analyzing programs to find their
hot spots, it would make since to try an automate this process. This is
precisely what a profiler can do for you. A profiler is a small program
that measures how long your code spends in any one portion of the program.
A profiler typically works by interrupting your code periodically and noting
the return address. The profiler builds a histogram of interrupt return
addresses (generally rounded to some user specified value). By studying
this histogram, you can determine where the program spends most of its time.
This tells you which sections of the code you need to optimize. Of course,
to use this technique, you will need a profiler program. Borland, Microsoft,
and several other vendors provide profilers and other optimization tools.
25.3 Is Optimization Necessary?
Except for fun and education, you should never approach a project with
the attitude that you are going to get maximal performance out of your code.
Years ago, this was an important attitude because that's what it took to
get anything decent running on the slow machines of that era. Reducing the
run time of a program from ten minutes to ten seconds made many programs
commercially viable. On the other hand, speeding up a program that takes
0.1 seconds to the point where it runs in a millisecond is often pointless.
You will waste a lot of effort improving the performance, yet few people
will notice the difference.
This is not to say that speeding up programs from 0.1 seconds to 0.001 seconds
is never worthwhile. If you are writing a data capture program that requires
you to take a reading every millisecond, and it can only handle ten readings
per second as currently written, you've got your work cut out for you. Furthermore,
even if your program runs fast enough already, there are reasons why you
would want to make it run twice as fast. For example, suppose someone can
use your program in a multitasking environment. If you modify your program
to run twice as fast, the user will be able to run another program along
side yours and not notice the performance degradation.
However, the thing to always keep in mind is that you need to write software
that is fast enough. Once a program produces results instantaneously (or
so close to instantaneous that the user can't tell), there is little need
to make it run any faster. Since optimization is an expensive and error
prone process, you want to avoid it as much as possible. Writing programs
that run faster than fast enough is a waste of time. However, as is obvious
from the set of bloated application programs you'll find today, this really
isn't a problem, most programming produce code that is way too slow, not
way too fast.
A common reason stated for not producing optimal
code is advancing hardware design. Many programmers and managers feel that
the high-end machines they develop software on today will be the mid-range
machines two years from now when they finally release their software. So
if they design their software to run on today's very high-end machines,
it will perform okay on midrange machines when they release their software.
There are two problems with the approach above. First, the operating system
running on those machines two years from now will gobble a large part of
the machine's resources (including CPU cycles). It is interesting to note
that today's machines are hundreds of times faster than the original 8088
based PCs, yet many applications actually run slower than those that ran
on the original PC. True, today's software provides many more features beyond
what the original PC provided, but that's the whole point of this arguement
- customers will demand features like multiple windows, GUI, pull-down menus,
etc., that all consume CPU cycles. You cannot assume that newer machines
will provide extra clock cycles so your slow code will run faster. The OS
or user interface to your program will wind up eating those extra available
So the first step is to realistically determine the performance requirements
of your software. Then write your software to meet that performance goal.
If you fail to meet the performance requirements, then it is time to optimize
your program. However, you shouldn't waste additional time optimizing your
code once your program meets or exceed the performance specifications.
25.4 The Three Types of Optimization
There are three forms of optimization you can use when improving the
performance of a program. They are choosing
a better algorithm (high level optimization), implementing the algorithm better (a medium level
optmization), and "counting cycles" (a low level optimization). Each technique has its
place and, generally, you apply them at different points in the development
Choosing a better algorithm is the most highly touted optimization technique.
Alas it is the technique used least often. It is easy for someone to announce
that you should always find a better algorithm if you need more speed; but
finding that algorithm is a little more difficult. First, let us define
an algorithm change as using a fundamentally different technique to solve
the problem. For example, switching from a "bubble sort" algorithm
to a "quick sort" algorithm is a good example of an algorithm
change. Generally, though certainly not always, changing algorithms means
you use a program with a better Big-Oh function. For example, when switching
from the bubble sort to the quick sort, you are swapping an algorithm with
an O(n2) running time for one with an O(n lg n) expected running time.
You must remember the restrictions on Big-Oh functions when comparing algorithms.
The value for n must be sufficiently large to mask the effect of hidden
constant. Furthermore, Big-Oh analysis is usually worst-case and may not
apply to your program. For example, if you wish to sort an array that is
"nearly" sorted to begin with, the bubble sort algorithm is usually
much faster than the quicksort algorithm, regardless of the value for n.
For data that is almost sorted, the bubble sort runs in almost O(n) time
whereas the quicksort algorithm runs in O(n2) time.
The second thing to keep in mind is the constant itself. If two algorithms
have the same Big-Oh function, you cannot determine any difference between
the two based on the Big-Oh analysis. This does not mean that they will
take the same amount of time to run. Don't forget, in Big-Oh analysis we
throw out all the low order terms and multiplicative constants. The asymptotic
notation is of little help in this case.
To get truly phenomenal performance improvements requires an algorithmic
change to your program. However, discovering an O(n lg n) algorithm to replace
your O(n2) algorithm is often difficult if a published solution does not
already exist. Presumably, a well-designed program is not going to contain
many obvious algorithms you can dramatically improve (if they did, they
wouldn't be well-designed, now, would they?). Therefore, attempting to find
a better algorithm may not prove successful. Nevertheless, it is always
the first step you should take because the following steps operate on the
algorithm you have. If you perform the other steps on a bad algorithm and
then discover a better algorithm later, you will have to repeat these time-consumings
steps all over again on the new algorithm.
There are two steps to discovering a new algorithms: research and development.
The first step is to see if you can find a better solution in the existing
literature. Failing that, the second step is to see if you can develop a
better algorithm on your own. The key thing is to budget an appropriate
amount of time to these two activities. Research is an open-ended process.
You can always read one more book or article. So you've got to decide how
much time you're going to spend looking for an existing solution. This might
be a few hours, days, weeks, or months. Whatever you feel is cost-effective.
You then head to the library (or your bookshelf) and begin looking for a
better solution. Once your time expires, it is time to abandon the research
approach unless you are sure you are on the right track in the material
you are studying. If so, budget a little more time and see how it goes.
At some point, though, you've got to decide that you probably won't be able
to find a better solution and it is time to try to develop a new one on
While searching for a better solution, you should study the papers, texts,
articles, etc., exactly as though you were studying for an important test.
While it's true that much of what you study will not apply to the problem
at hand, you are learning things that will be useful in future projects.
Furthermore, while someone may not provide the solution you need, they may
have done some work that is headed in the same direction that you are and
could provide some good ideas, if not the basis, for your own solution.
However, you must always remember that the job of an engineer is to provide
a cost-effective solution to a problem. If you waste too much time searching
for a solution that may not appear anywhere in the literature, you will
cause a cost overrun on your project. So know when it's time to "hang
it up" and get on with the rest of the project.
Developing a new algorithm on your own is also open-ended. You could literally
spend the rest of your life trying to find an efficient solution to an intractible
problem. So once again, you need to budget some time for this process accordingly.
Spend the time wisely trying to develop a better solution to your problem,
but once the time is exhausted, it's time to try a different approach rather
than waste any more time chasing a "holy grail."
Be sure to use all resources at your disposal when trying to find a better
algorithm. A local university's library can be a big help. Also, you should
network yourself. Attend local computer club meetings, discuss your problems
with other engineers, or talk to interested friends, maybe they're read
about a solution that you've missed. If you have access to the Internet,
BIX, Compuserve, or other technically oriented on-line services or computerized
bulletin board systems, by all means post a message asking for help. With
literally millions of users out there, if a better solution exists for your
problem, someone has probabaly solved it for you already. A few posts may
turn up a solution you were unable to find or develop yourself.
At some point or another, you may have to admit failure. Actually, you may
have to admit success - you've already found as good an algorithm as you
can. If this is still too slow for your requirements, it may be time to
try some other technique to improve the speed of your program. The next
step is to see if you can provide a better implementation for the algorithm
you are using. This optimization step, although independent of language,
is where most assembly language programmers produce dramatic performance
improvements in their code. A better implementation generally involves steps
like unrolling loops, using table lookups rather than computations, eliminating
computations from a loop whose value does not change within a loop, taking
advantage of machine idioms (such as using a shift or shift and add rather
than a multiplication), trying to keep variables in registers as long as
possible, and so on. It is surprising how much faster a program can run
by using simple techniques like those whose descriptions appear thoughout
As a last resort, you can resort to cycle counting. At this level you are
trying to ensure that an instruction sequence uses as few clock cycles as
possible. This is a difficult optimization to perform because you have to
be aware of how many clock cycles each instruction consumes, and that depends
on the instruction, the addressing mode in use, the instructions around
the current instruction (i.e., pipelining and superscalar effects), the
speed of the memory system (wait states and cache), and so on. Needless
to say, such optimizations are very tedious and require a very careful analysis
of the program and the system on which it will run.
The OL crowd always claims you should put off optimization as long as possible.
These people are generally talking about this last form of optimization.
The reason is simple: any changes you make to your program after such optimizations
may change the interaction of the instructions and, therefore, their execution
time. If you spend considerable time scheduling a sequence of 50 instructions
and then discover you will need to rewrite that code for one reason or another,
all the time you spent carefully scheduling those instructions to avoid
hazards is lost. On the other hand, if you wait until the last possible
moment to make such optimizations to you code, you will only optimize that
Many HLL programmers will tell you that a good compiler can beat a human
being at scheduling instructions and optimizing code. This isn't true. A
good compiler will beat a mediocre assembly language program a good part
of the time. However, a good compiler won't stand a chance against a good
assembly language programmer. After all, the worst that could happen is
that the good assembly language programmer will look at the output of the
compiler and improve on that.
"Counting cycles" can improve the performance of your programs.
On the average, you can speed up your programs by a factor of 50% to 200%
by making simple changes (like rearranging instructions). That's the difference
between an 80486 and a Pentium! So you shouldn't ignore the possibility
of using such optimizations in your programs. Just keep in mind, you should
do such optimizations last so you don't wind up redoing them as your code
The rest of this chapter will concentrate on the techniques for improving
the implementation of an algorithm, rather than designing a better algorithm
or using cycle counting techniques. Designing better algorithms is beyond
the scope of this manual (see a good text on algorithm design). Cycle counting
is one of those processes that differs from processor to processor. That
is, the optimization techniques that work well for the 80386 fail on a 486
or Pentium chip, and vice versa. Since Intel is constantly producing new
chips, requring different optimization techniques, listing those techniques
here would only make that much more material in this book outdated. Intel
publishes such optimization hints in their processor programmer reference
manuals. Articles on optimizing assembly language programs often appear
in technical magazines like Dr. Dobb's Journal, you should read such articles
and learn all the current optimization techniques.
- 25.1 - When to Optimize, When
Not to Optimize
- 25.2 - How Do You Find the Slow Code in
- 25.3 - Is Optimization Necessary?
- 25.4 - The Three Types of Optimization
- 25.5 - Improving the Implementation
of an Algorithm
Art of Assembly: Chapter Twenty-Five - 29 SEP 1996
[Next] [Art of Assembly][Randall