Hi Christoph,
If you first load the loop bound into a local variable, the #pragma no_alias isn't necessary (in fact, it isn't helping you anyway, as I noted above). #pragma loop_count doesn't reduce the number of instructions in the loop itself (I always get a single-instruction loop when the upper bound is a local variable). However, it does help with the code around the loop: you're correct that a check that count > 0 is otherwise required to decide whether to enter the loop at all. Compare the following without the loop_count pragma:
R0 = W[P1] (X);
CC = R0 <= 0;
if CC jump .P33L3 ;
.P33L6:
P1 = R0;
R0 = 0;
LOOP .P33L2L LC0 = P1;
.P33L2:
LOOP_BEGIN .P33L2L;
[P0++] = R0;
LOOP_END .P33L2L;
with the following when I use the pragma:
R2 = R2 - R2 (NS) || R1 = W[P1] (X);
P1 = R1;
P0 = R0;
LOOP .P33L2L LC0 = P1;
.P33L2:
LOOP_BEGIN .P33L2L;
[P0++] = R2;
LOOP_END .P33L2L;
The loop kernel is identical, but the latter code reaches the loop slightly more efficiently, because the compare and conditional jump that guard against a zero or negative count are gone.
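For reference, here is a minimal sketch of the kind of source I mean. It is not your code: the names fill_params and clear_buffer are made up for illustration, the 16-bit count field is only a guess based on the sign-extended 16-bit load in the listings, and the pragma arguments assume the loop_count(min, max, modulo) form, so please check them against the manual for your tools version. The __ADSPBLACKFIN__ guard is also an assumption on my part, there so that other compilers skip the pragma.

```c
#include <stddef.h>

/* Hypothetical parameter block; the 16-bit count field is an assumption
   chosen to match the W[P1] (X) load in the listings above. */
typedef struct {
    short count;
} fill_params;

void clear_buffer(int *buf, const fill_params *p)
{
    /* Copy the bound into a local first, so the compiler does not have to
       assume that a store through buf could modify p->count. */
    const int n = p->count;
    int i;

#if defined(__ADSPBLACKFIN__)
    /* Assumed form: loop_count(min, max, modulo). A minimum of 1 is a
       promise that the loop always runs at least once, which is what
       lets the compiler drop the "count <= 0" test before the loop. */
#pragma loop_count(1, 32767, 1)
#endif
    for (i = 0; i < n; i++)
        buf[i] = 0;
}
```

Bear in mind that the minimum of 1 is a promise rather than a check: if count can ever be zero or negative at run time, leave the pragma off and accept the extra compare and jump.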
If you're seeing different behaviour from this, perhaps you could post a complete, compilable example, and specify the compiler options and tools versions you are using?
All the best,
Michael.