This is the mail archive of the ecos-patches@sourceware.org mailing list for the eCos project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug 1001607] Cortex-M4F architectural Floating Point Support


Please do not reply to this email. Use the web interface provided at:
http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001607

--- Comment #10 from Ilija Kocho <ilijak@siva.com.mk> 2012-06-24 16:00:34 BST ---
Hi Jifl

Thank you for your review.

(In reply to comment #9)
> Hi Ilija,
> 
> Thanks for all this work. It looks really good! I've gone through it all now,
> and unsurprisingly have some observations. It's quite a big patch so it ends up
> being quite long...
> 
> Architecture HAL:
> 
> - The CDL seems to be getting much more complex than it really needs to be.
> There is now CDL for: CYGHWR_HAL_CORTEXM (== M3 or M4),
> CYGINT_HAL_CORTEXM4_CODE, CYGOPT_HAL_CORTEXM4_CODE, CYGOPT_HAL_CORTEXM3_CODE,
> CYGINT_HAL_CORTEXM_FPU, CYGINT_HAL_CORTEXM_FPV4_SP_D16, CYGHWR_HAL_CORTEXM_FPU,
> and to an extent CYGHWR_HAL_CORTEXM_FPU_SWITCH=="NONE", which all have
> overlapping behaviour with respect to each other and effects on each other.
> There is unnecessary redundancy here.
> 
> I think we can simply remove CYGINT_HAL_CORTEXM4_CODE, CYGOPT_HAL_CORTEXM4_CODE
> and CYGOPT_HAL_CORTEXM3_CODE because even if building with -mcpu=cortex-m4, gcc
> defaults to soft floating-point. So that means if CYGHWR_HAL_CORTEXM == "M4"
> then it's safe to use -mcpu=cortex-m4. So CYGBLD_GLOBAL_XFLAGS_ARCH can go too.

I am aware of this excessive complexity, as it gave me some headache but I'm
afraid that for the time being it is necesary. The reason for keeping
-mcpu=cortex-m3 option even for Cortex-M4 cores, is compatibility with the
actual eCos compiler, gcc-4.3.2, that doesn't support/recognize
-mcpu=cortex-m4. We can reconsider once we switch to gcc-4.6.3 or newer. 

Provided that we keep selection I would be happier with radio buttons. I have
seen some of them (such as schedulers) but haven't figured out how it's done.
What's the trick?

FAOD, I allow that it could be simpler. I have tried a number of code variants
and stayed at the one that gives complete conflict resolution in configtool for
most of user interactions.
To my understanding is_substr() is kind of tough task for configtool and I
would avoid it (see discussion on FPU CFLAGS below).

> 
> This line can just go away:
> +        requires { !is_active(CYGOPT_HAL_CORTEXM4_CODE) implies
> CYGHWR_HAL_CORTEXM_FPU == 0 }
> 
> - CYGHWR_HAL_CORTEXM_FPU should be just:
>             active_if       CYGINT_HAL_CORTEXM_FPU
>             default_value   1
> 
> You don't need to set the default value specially when its inactive anyway.

I know, only it doesn't look beautiful (in my eyes) if some checkbox is checked
and inactive. But probably I could get used to.

> 
> - In CYGHWR_HAL_CORTEXM_FPU's description there's a "\n". Also I wouldn't call
> it "ordinary" behaviour. Perhaps "simplest".

I took the description from 386, but I agree "simplest" is better, I'll change.

> 
> - Here's a possible outline of a simplification. What do you think?
>   cdl_option CYGHWR_HAL_CORTEXM_FPU_FLAGS {
>      display    "FPU build flags"
>      calculated { (CYGHWR_HAL_CORTEXM_FPU ? " -mfloat-abi=hard" : "" ) .
> (CYGINT_HAL_CORTEXM_FPV4_SP_D16 ? " -mfpu=fpv4-sp-d16" : "" ) }
>    }
>    requires is_substr(CYGBLD_GLOBAL_CFLAGS, CYGHWR_HAL_CORTEXM_FPU_FLAGS )
>    requires is_substr(CYGBLD_GLOBAL_LDFLAGS, CYGHWR_HAL_CORTEXM_FPU_FLAGS )
>   requires { !CYGHWR_HAL_CORTEXM_FPU implies !is_substr(CYGBLD_GLOBAL_CFLAGS,
> "-mfpu=fpv4-sp-d16") && !is_substr(CYGBLD_GLOBAL_CFLAGS, "-mfloat-abi=hard") }
>   requires { !CYGHWR_HAL_CORTEXM_FPU implies !is_substr(CYGBLD_GLOBAL_LDFLAGS,
> "-mfpu=fpv4-sp-d16") && !is_substr(CYGBLD_GLOBAL_LDFLAGS, "-mfloat-abi=hard") }

Since you ask me, I would get rid of is_substr(). Something like this:

cdl_option CYGHWR_HAL_CORTEXM_FLOAT_ABI_FLAG {
      flavor data
      no_define
      display    "Float ABI flag"
      calculated { CYGHWR_HAL_CORTEXM_FPU ? " -mfloat-abi=hard" : "" }
}

cdl_option CYGHWR_HAL_CORTEXM_FPU_FLAG {
      flavor data
      no_define
      display    "FPU build flags"
      calculated { (CYGHWR_HAL_CORTEXM_FPU && CYGINT_HAL_CORTEXM_FPV4_SP_D16) ?
                 " -mfpu=fpv4-sp-d16" : "" }
}

Ditto for -mcpu=cortex-m3/m4 (CYGHWR_HAL_CORTEXM_CPU_FLAG).

Then we concatenate all these in CYGBLD_GLOBAL_CFLAGS.

cdl_option CYGBLD_GLOBAL_CFLAGS {
    display "Global compiler flags"
    flavor  data
    no_define
    default_value { CYGBLD_GLOBAL_WARNFLAGS . CYGHWR_HAL_CORTEXM_CPU_FLAG .
       " -mthumb -g -O2 ffunction-sections -fdata-sections -fno-rtti
-fno-exceptions" .
       CYGHWR_HAL_CORTEXM_FLOAT_ABI_FLAG . CYGHWR_HAL_CORTEXM_FPU_FLAG }
}

I think that in long term this is better solution since it is easy to expand
with new FPUs in future.

> 
> - There is a quite important design issue. A lot of things assume the only
> processor with an FPU is the Cortex-M4 and that FPU has the fpv4-sp-d16
> arrangement. This is unlikely to remain the case, and it's easy to anticipate.
> The M4 FPU is a bit cut-down after all, so it's easy to see a full VFP4 unit
> being added in a later Cortex-M.  So we should be wrapping up much more
> functionality in macros that can be supplied according to the FPU design, the
> most obvious alternative possibility being a full VFP4 unit, to give an idea of
> what to abstract.
> 
> With that in mind, I think there needs to be a fpv4-sp-d16.h header file,
> probably included from cortexm_fpu.h (which could in future then instead
> include alternatives e.g. a vfp4.h header once that became relevant). That
> FPU-specific header will define a number of FPU-model-specific properties and
> macros used for the implementation, such as from cortexm_stub.h and context.S. 
> 

Funny, usually I am warned for over-generalization, now it'oposite :-).
I will consuder this and try to construct something. I wonder if we will need
cortexm_fpu.h (yet another level of indirection) or we can simply rename
cortexm_fpu.h into fpv4-sp-d16.h. Then we have cortexm_fpu.c and
hal_cortexm_fpu.cdl...

In the text below i'll simply snip related text segments. Don't worry, nothing
is discarded.

[snip]

> - In cortexm_regs.h, there has been the addition of
> CYGARC_DSB()/CYGARC_CORTEXM_DATA_SYNC_BARRIER() and
> CYGARC_ISB()/CYGARC_CORTEXM_INSTR_SYNC_BARRIER(). I think you should choose one
> name for each and stick with it.

I'll keep the short one.

> 
> Also for consistency with some other architectural ports, there should be a
> HAL_MEMORY_BARRIER define:
> 
> #define HAL_MEMORY_BARRIER() \
>   CYGARC_CORTEXM_DATA_SYNC_BARRIER(); CYGARC_CORTEXM_INSTR_SYNC_BARRIER()

Or:
#define HAL_MEMORY_BARRIER() CYGARC_DSB(); CYGARC(ISB)

> 
> ( This could then be used in cortexm_fpu.h, hal_fpu.c etc, to replace existing
> calls which do exactly that, although that isn't vital.)
> 
> OOI strictly that cortexm_regs.h doesn't need #ifndef __ASSEMBLER__ as its all
> macro definitions, but you can keep it like that if you want.

What's OOI?

[snip]

> 
> - You don't save the floating point context for interrupts, but do for
> exceptions. There's an argument that since both interrupt and exception
> handlers could call arbitrary code which therefore may use FP, the context
> should be saved. There's also an argument that unless this is a ROM monitor and
> including stubs, you don't need to save the exception info either. I think
> there may be merit in having two config options to govern whether to save FP
> state for exceptions and interrupts. The default value for the exceptions one
> would be CYGSEM_HAL_ROM_MONITOR || CYGDBG_HAL_DEBUG_GDB_INCLUDE_STUBS, the
> default for the interrupt one would be 0.
> 

I expected some arguments here... I can't imagine myself using FPU in ISR or
DSR, but if it is an option, disabled by default here are some considerations:
  - ALL: Implementation is easy, we just skip disabling of automatic stack
saving.
  - LAZY: Lazy saving introduces variable frame length that requires handling
that will produce worst case latency even worse than for ALL. It requires suble
intervention in ISR and may unnecessarily delay the project.

Therefore I would consider immediate implementation for ALL, but for LAZY i
would prefer to put "Not implemented in current release" for the time being.

> - Just as a general coding standards thing, remember to put brackets round uses
> of __regs_p in the new HAL_THREAD_INIT_FPU_* macros.
> 
> - FYI you can use CYG_EMPTY_STATEMENT instead of " CYG_MACRO_START
> CYG_MACRO_END"

Useful information, thanks.


[snip]

> 
> - hal_io.h defines UFSR. I think you should give the comment as "Usage fault
> status reg" rather than just "Misc".

OK.

> 
> - In context.S, CYGARC_FPU_CPACR_ENABLE could usefully be put in cortexm_fpu.h
> where it could also be used by hal_cortexm_fpu_enable/disable (or their
> replacement macros, as above).
> 
> - The title at the top of hal_fpu.c should be something like "HAL FPU support".
> 
> - Why does it include <pkgconf/kernel.h> ?

It is a remaining form older file that I used as template. There might be other
similar cases. Also I don't have complete overview what to include. I am going
to clean-up once the packages are close to check in.

[snip]

> - hal_misc.c: You should ignore the startup type for the call to hal_init_fpu.
> We can't assume for definite a previous ROM monitor was built with FPU support
> enabled, so we can't rely on it (and equally hal_init_fpu() should not just
> enable the FPU if needed, but also explicitly disable it if this is lazy
> switching mode and it's RAM startup).

OK.

> 
> - hal_misc.c: hal_get/set_gdb_registers. This duplication can definitely be
> avoided with the use of macros (including a default one for the pretend FPA
> regs). And then the details would be provided via cortexm_fpu.h by my proposed
> fpv4-sp-d16.h.
> 
> - In vectors.S: This change is accidental:
>   -// NOTE: At present this implementation does not permit an exception
>   +// NOTE:p At present this implementation does not permit an exception
> 
> - Again we should definitely aim to replace FPU-specific parts with macros,
> rather than duplicating entire functions such as hal_default_exception_vsr.
> Either with the C preprocessor, or GAS .macro. If the latter you could have an
> include/fpu.inc file, which will #include fpv4-sp-d16.h again (which would then
> need to be asm-include safe) to configure itself - up to you. You can again
> look at the i386 arch HAL to see what it did in the same situation -
> specifically arch.inc, although it is more complex than we would need to be in
> cortex-m.
> 
> But I think something needs to be done otherwise it becomes much harder to
> follow the asm.

My intention was to leave as much of present code, especially ASM verbatim.
Hence the code duplication.
But now we can go on with macros.

> 
> - Given what I suggested above about being able to save FP context from ISRs,
> something would be needed with the hal_default_interrupt_vsr here, although
> with some care, I would expect the same macros should be able to be reused for
> both exceptions and interrupts.
> 

The automatic stack frame is different than the one used in
default_exception_vsr

> - hal_usagefault_exception_vsr is almost identical to
> hal_default_exception_vsr. I don't see why hal_default_exception_vsr can't be
> used, and instead treated specially in hal_deliver_exception() instead, which
> would be less code overall.
> 

Cortex-M architecture, unlike pure ARM, provides us with individual
exception/interrupt vectors and it would be waste not to use them. We get both
simpler, cleaner and faster code. I would rather extend this approach to other
other exception handlers.

However, we can put the common code sequences in macros.

[snip] I really stand for individual vectors.

> 
> - Alternatively, if there's a reason to keep hal_usagefault_exception_vsr
> separate from hal_default_exception_vsr, then it's still the case it only needs
> to be included if CYGHWR_HAL_CORTEXM_FPU_SWITCH_LAZY I think.

OK. 

> 
> Tests:
> 
> - For fptestf.c and fpinttest.c: The test for whether the test is NA should
> also include <pkgconf/isoinfra.h> and check CYGINT_ISO_C_CLOCK_FUNCS is
> defined, and also CYGPKG_LIBM for fabsf() (which should then not be
> prototyped). Note this uses a libm implementation which hasn't been checked in
> yet. 

FYI fabsf() is a GCC built-in function, no libm involved, hence declaration. It
is of course temporary until float libm gets committed (thanks for the
reminder).

[snip]

> 
>  I think stdio.h isn't needed? And the include of math.h should be within the
> #if block for the test requirements otherwise the test won't build because
> math.h doesn't exist. You should include <string.h> for strlen and check for
> CYGINT_ISO_STRING_STRFUNCS.

I am using printf() with serial device driver occasionally in order to enforce
some interrupt traffic.

> 
> - In thread_switch_fpu.cxx you have a FIXME in thread_fp() about gcc maybe
> optimising things away. I think it could. A simple solution is to make op1 and
> op2 be global arrays, indexed by the thread index (the indx variable). That
> should make gcc treat them more carefully. At the very end,
> thread_swotch_fpu.cxx -> thread_switch_fpu.cxx :-).

I'll check.

> 
> Don't let the length of this make you think that there are big problems
> here.... overall this is great work!

Your review is great work too. Thanks.

Ilija

-- 
Configure bugmail: http://bugs.ecos.sourceware.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]