2009-02-06 00:35 -!- RazvanM(~RazvanM@96.234.232.151) has joined #tux3
2009-02-06 01:19 hey flips
2009-02-06 01:19 ACTION reads the backlog like normal
2009-02-06 03:00 -!- cam(~cam@203-219-255-75.tpgi.com.au) has joined #tux3
2009-02-06 06:30 -!- tim_dimm(~timothyhu@cpe-76-168-94-231.socal.res.rr.com) has joined #tux3
2009-02-06 07:28 -!- tim_dimm(~timothyhu@cpe-76-168-94-231.socal.res.rr.com) has joined #tux3
2009-02-06 08:02 -!- tim_dimm(~timothyhu@cpe-76-168-94-231.socal.res.rr.com) has joined #tux3
2009-02-06 09:31 -!- mingming(~mingming@32.97.110.51) has joined #tux3
2009-02-06 10:23 -!- RazvanM(~RazvanM@dazzler.isi.jhu.edu) has joined #tux3
2009-02-06 11:38 -!- tim_dimm(~timothyhu@cpe-76-168-94-231.socal.res.rr.com) has joined #tux3
2009-02-06 11:53 -!- tim_dimm(~timothyhu@cpe-76-168-94-231.socal.res.rr.com) has joined #tux3
2009-02-06 14:16 -!- tim_dimm(~timothyhu@cpe-76-168-94-231.socal.res.rr.com) has joined #tux3
2009-02-06 18:59 -!- macan(~macan@159.226.41.137) has joined #tux3
2009-02-06 20:59 ok, time to make atomic commit do more
2009-02-06 20:59 starting with our nice new superblock IO stuff
2009-02-06 21:34 two more biggish bits left to add for atomic commit: 1) btree node split logging 2) load log blocks and replay to reconstruct pinned metadata
2009-02-06 21:34 then comes the for-real debugging
2009-02-06 21:34 we're close
2009-02-06 21:36 ACTION assumes the lotus position to focus on btree node splits
2009-02-06 23:09 another idea for robust stage_delta
2009-02-06 23:09 blockdirty() can return buffer_head
2009-02-06 23:10 if page is already forked, it returns -EAGAIN
2009-02-06 23:10 if page is not forked yet, it forks
2009-02-06 23:11 I guess -EAGAIN can happen only small window
2009-02-06 23:11 well, so if EAGAIN happen, caller should restart from blockread()
2009-02-06 23:12 I looked into that strategy, and found a difficult issue
2009-02-06 23:12 but I didn't write it down
2009-02-06 23:12 :)
2009-02-06 23:12 what is issue?
2009-02-06 23:13 I will have to think
2009-02-06 23:13 I didn't not consider the EAGAIN return
2009-02-06 23:13 my key point is using the page bit to marks page was forked
2009-02-06 23:13 I mean, I did not consider the EAGAIN return
2009-02-06 23:14 ah, that is indeed a robust idea
2009-02-06 23:14 probably, I guess this would work
2009-02-06 23:15 It sounds promising
2009-02-06 23:15 but, not sure performance is good or not
2009-02-06 23:15 what is the performance issue?
2009-02-06 23:15 -EAGIN will run again from blockread
2009-02-06 23:15 restart from read?
2009-02-06 23:15 yes
2009-02-06 23:15 it sounds rare
2009-02-06 23:16 probably
2009-02-06 23:16 just not sure
2009-02-06 23:16 so the second blockread will retreive the copied page from page cache?
2009-02-06 23:17 yes
2009-02-06 23:17 meaning we do not have to change the page out from under the buffer
2009-02-06 23:18 the forked page would need to carry a (2 bit) delta number
2009-02-06 23:19 why 2bit number is needed?
2009-02-06 23:19 because a page forked into the current delta will belong to the committing delta after phase transition
2009-02-06 23:19 it would have to be forked again if dirtied in current delta
2009-02-06 23:21 forked twice?
2009-02-06 23:22 yes, if dirtied in three successive deltas
2009-02-06 23:22 I think if it's three, the page is
2009-02-06 23:23 first => second => third
2009-02-06 23:23 clean -> dirty in current -> delta transition -> dirty in committing -> dirty -> dirty in current -> delta transition -> dirty in committing -> dirty -> dirty in current
2009-02-06 23:23 ah, that's right
2009-02-06 23:23 :)
2009-02-06 23:23 good
2009-02-06 23:24 I was thinking about the wrong object, I was thinking about the buffer when I should have thought about the page
2009-02-06 23:24 it seems robust indeed
2009-02-06 23:24 less radical
2009-02-06 23:26 and buffer will also be inherit to forked page
2009-02-06 23:27 and I guess another benefit is frontend would not be needed buffer_head
2009-02-06 23:27 buffer_head is just used to remember dirty buffer
2009-02-06 23:28 yes
2009-02-06 23:28 and pointer to data may be enough for frontend
2009-02-06 23:29 well, it's another story
2009-02-06 23:30 good, well we have several weeks to analyze this
2009-02-06 23:30 yes
2009-02-06 23:30 I'll play with this for a while
2009-02-06 23:31 I will keep making incrmental steps towards atomic commit demo
2009-02-06 23:31 ok
2009-02-06 23:32 your work is main for atomic commit
2009-02-06 23:39 -!- RazvanM(~RazvanM@96.234.232.151) has joined #tux3
2009-02-06 23:45 the diskwrites in user/commit.c need to be replaced with async IO, that is sync in userspace
2009-02-06 23:45 in order to work in kernel
2009-02-06 23:45 hey flips
2009-02-06 23:46 ACTION reads the backlog
2009-02-06 23:46 hi bh
2009-02-06 23:46 it's good
2009-02-06 23:46 just sped through your area to get to San Francisco, made it
2009-02-06 23:46 hand made aio or aio thread
2009-02-06 23:49 aio is so much easier in kernel than usespace
2009-02-06 23:49 anyway, it doesn't need to be async in userspace
2009-02-06 23:49 it can just complete synchronously in the submission, and the completion code does nothing
2009-02-06 23:50 ah, I guessed you want it
2009-02-06 23:50 optimizing user space is not a high priority
2009-02-06 23:50 having it run similar code to kernel is the important thing
2009-02-06 23:50 yes, it's test purpose
2009-02-06 23:51 so, something like syncio
2009-02-06 23:52 maybe using user space definition of biovec, like you showed in your patch
2009-02-06 23:52 or maybe, abstracting the interface in a different way
2009-02-06 23:52 hiding the biovecs
2009-02-06 23:52 pass an endio callback to something like devio
2009-02-06 23:53 yes
2009-02-06 23:53 I guess that would be the simplest
2009-02-06 23:53 except, some control struct is needed
2009-02-06 23:53 well, if we need aio stuff, I guess it's not hard if simple one
2009-02-06 23:53 which in kernel is bio
2009-02-06 23:54 yes
2009-02-06 23:54 I found one issue with page bit strategy
2009-02-06 23:54 anyway, or usespace endio can just be a no-op
2009-02-06 23:54 our usespace endio can just be a no-op
2009-02-06 23:54 yes
2009-02-06 23:55 it drops ability to get forked page via buffer_head which current one does
2009-02-06 23:56 "it" means what?
2009-02-06 23:56 page bit strategy
2009-02-06 23:56 page bit strategy with buffer_head inheriting
2009-02-06 23:57 stable buffer_head for backend by page bit
2009-02-06 23:57 right
2009-02-06 23:57 thinking
2009-02-06 23:57 it's really not important how we trace the back end page, we just need to track it somehow
2009-02-06 23:58 putting a buffer head on it was one way, would could also wrap it with a bio
2009-02-06 23:58 which is probably sensible
2009-02-06 23:58 we have to wrap it with a bio some time anyway
2009-02-06 23:59 ah, too many typos
2009-02-06 23:59 it's really not important how we >track< the back end page, we just need to track it somehow
2009-02-06 23:59 putting a buffer head on it was one way, >we< could also wrap it with a bio
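
[Editor's sketch, not part of the log] The 23:49 to 23:54 exchange describes userspace I/O that completes synchronously inside the submission call and then invokes an endio callback that may be a no-op, with "some control struct" standing in for the kernel's bio. One way that could look is sketched below. The names `struct ioreq`, `syncio_write` and `noop_endio` are hypothetical and are not taken from the tux3 source; only the POSIX `pwrite` call is real.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical control struct; in the kernel this role is played by a bio. */
struct ioreq {
	int fd;                 /* open file or device to write to */
	off_t offset;           /* byte offset of the write */
	void *data;             /* buffer to write */
	size_t len;             /* number of bytes to write */
	void (*endio)(struct ioreq *req, int err); /* completion callback */
	void *private;          /* caller context for the callback */
};

/* Userspace completion handler that does nothing, as suggested in the chat. */
static void noop_endio(struct ioreq *req, int err)
{
	(void)req;
	(void)err;
}

/*
 * Submit a write. Unlike a kernel submission, the I/O has already finished
 * when this returns: the callback runs inside the submission call.
 * Returns 0 on success or a negative errno value.
 */
static int syncio_write(struct ioreq *req)
{
	size_t done = 0;
	int err = 0;

	while (done < req->len) {
		ssize_t n = pwrite(req->fd, (char *)req->data + done,
				   req->len - done, req->offset + (off_t)done);
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry the same range */
			err = -errno;
			break;
		}
		if (n == 0) {
			err = -EIO; /* no progress, give up rather than spin */
			break;
		}
		done += (size_t)n;
	}
	if (req->endio)
		req->endio(req, err);
	return err;
}
```

Because the caller only ever sees "submit, then endio fires", the same calling code could in principle sit on top of a truly asynchronous kernel submission, which is the "run similar code to kernel" goal stated at 23:50.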