You are posing good questions, and I don't have easy answers to them. I rather doubt, though, that SRB code could be used to answer them.
You are correct that TIMEUSED probably won't answer questions about resource usage for database calls. I'm not so sure, though, that an SRB function can expose this information without very deep knowledge of how the database processor works internally, knowledge I'm not certain IBM has divulged. SRB usage can be interpreted at many different levels, but one should be extremely careful about this interpretation. One possible use of an SRB function in a database application is to communicate the result of a database function back to the requester.
When I wrote my SRB code almost 30 years ago, it was in part to do the kind of performance analysis you are curious about, but the target was available in source, so deep knowledge of its functioning could be easily acquired. Sadly, most database systems are OCO (object code only), so it will be much more difficult for a user (us, in other words) without access to the source to acquire this knowledge.
Reviewing the intelligence gathered using my code, I think the only useful point we extracted was resisted by the product vendor. It came from something that was inserted relatively late in development: the SRB code was able to detect when the object being analyzed was stuck on a page fault. With that, we located a single point where there were a lot of avoidable page faults, all taken to extract just a couple of data bytes. Those bytes could have been stashed in a more heavily used storage area, one unlikely to get a repeated page fault, shortly after the application obtained the page containing the data, rather than being fetched many minutes later after the page had been sent off to the page dataset.
I was convinced that other data acquired by the SRB code was misunderstood and so misused. The rest of the data didn't reveal much that was a surprise. The largest CPU user was EXCP, which was no surprise. Code that was thought to be a significant CPU user turned out not to be, which was a bit of a surprise. While it was possible to make minor improvements in other code, the EXCP hammer head, which we could do nothing about, made any net improvement we could make elsewhere nearly pointless.
The target had little direct involvement with user code, and my SRB code did not measure this involvement. Indeed, it could not have measured it, since it would have had to interrogate user address spaces, which would have greatly increased the overhead and risks, both in the SRB code and in the driver that issued the SCHEDULE macro to start the SRB code.
I'm not sure SRB code could track a database function back to a user request, which I think is one of your goals, though it might be possible.
One weakness in my approach is that the targeted application ran a number of subtasks, and we made no attempt to measure subtask usage, something that I now think would have been useful.