2009-05-04 03:43 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #tux3
2009-05-04 04:01 retire of log block itself is not good
2009-05-04 04:01 rather seems bad
2009-05-04 04:02 log block can be retired after 2 cycle of flush_log()
2009-05-04 04:03 it means we must have the almost log blocks of 3 flush_log()
2009-05-04 04:04 3 flush_log() retires oldest log blocks
2009-05-04 04:04 why?
2009-05-04 04:05 the log blocks of previous cycle is needed to know the deflush log records
2009-05-04 04:05 so, 2 previous cycle can be retired after flush_log()
2009-05-04 04:07 maybe, this would be hard to see why, without some figure
2009-05-04 04:07 so, I'll explain with chat, maybe, tomorrow?
2009-05-04 04:11 well, so, I'm thinking to make the log of bfree for retired log blocks
2009-05-04 04:12 because, 2 previous log blocks is just used to know log blocks address itself
2009-05-04 04:12 if I'm not missing something
2009-05-04 06:05 -!- dcg(~dcg@234.pool80-103-0.dynamic.orange.es) has joined #tux3
2009-05-04 06:24 -!- hirofumi(~hirofumi@210.171.168.39) has joined #tux3
2009-05-04 06:39 -!- npmccallum(~npmccallu@76.177.118.207) has joined #tux3
2009-05-04 09:32 -!- tim_dimm(~timothyhu@pool-71-107-51-9.lsanca.dsl-w.verizon.net) has joined #tux3
2009-05-04 10:26 -!- dcg_(~dcg@224.pool80-103-1.dynamic.orange.es) has joined #tux3
2009-05-04 11:06 hi hirofumi
2009-05-04 11:12 hi
2009-05-04 11:13 do you have time for a hour or so?
2009-05-04 11:13 I'd like to talk about retiring the log blocks
2009-05-04 11:13 for you, always :)
2009-05-04 11:14 thanks :)
2009-05-04 11:14 ok
2009-05-04 11:14 I'll make some figure for it
2009-05-04 11:16 ok, I had a small observation that might simplify things a little
2009-05-04 11:17 http://userweb.kernel.org/~hirofumi/logblock.note
2009-05-04 11:17 which is: for the first log block in the sequence, we can record the beginning of valid log data
2009-05-04 11:17 or rather, for the entire log sequence
2009-05-04 11:18 valid log data?
2009-05-04 11:18 yes, so that we can drop part of a log block while keeping some log messages at the end of the block
2009-05-04 11:18 for example...
2009-05-04 11:19 if we are freeing (retiring) log blocks, we can write a log_bfree log messages instead of updating the bitmaps
2009-05-04 11:19 and that message goes at the end of the current log block
2009-05-04 11:19 yes
2009-05-04 11:20 it may be needed to make things simple
2009-05-04 11:20 and all the previous log blocks are freed, except for the last one
2009-05-04 11:20 and I'm thinking which is needed it?
2009-05-04 11:20 yes, basically
2009-05-04 11:20 I think this is useful for retiring the log blocks
2009-05-04 11:20 a slight simplification
2009-05-04 11:21 yes, probably
2009-05-04 11:21 it gets a little tricky otherwise
2009-05-04 11:21 yes, exactly
2009-05-04 11:21 the tricky part is trying to update bitmap blocks, it is easier to leave them alone and write log records
2009-05-04 11:21 by recording the beginning of valid log data, we don't have to start a new log block to record the bfrees
2009-05-04 11:22 I think it also reduces number of log blocks that have to be written a little
2009-05-04 11:22 ok, I will read the note
2009-05-04 11:22 and the cause of tricky is, defree is used to free *and* reservation to prevent overwrite it
2009-05-04 11:22 yes
2009-05-04 11:23 note is just to talk with the mark, (1), (2) ...
2009-05-04 11:23 and another tricky detail is, we have to be able to reconstruct the defree list
2009-05-04 11:23 as you pointed out earlier
2009-05-04 11:23 **** is flush_log cycle
2009-05-04 11:23 yes
2009-05-04 11:24 so, let me talk current situation
2009-05-04 11:24 from (1) to after (2), is needed logs
2009-05-04 11:24 you ascii picture is fine
2009-05-04 11:25 it is exactly my thinking
2009-05-04 11:25 we are on after (2)
2009-05-04 11:25 ah
2009-05-04 11:25 and after (1) contains the logs for defree/deflush
2009-05-04 11:26 so, we can't free those until next flush_log()
2009-05-04 11:26 5 minutes please
2009-05-04 11:26 yes
2009-05-04 11:26 I will be back
2009-05-04 11:26 ok
2009-05-04 11:28 and the issue is, when do we can retire log block itself?
2009-05-04 11:28 when can we retire the log blocks itself?
2009-05-04 11:30 I've added the (1a) and (2a)
2009-05-04 11:30 added to http://userweb.kernel.org/~hirofumi/logblock.note
2009-05-04 11:30 back
2009-05-04 11:31 ok
2009-05-04 11:31 I see it
2009-05-04 11:33 this is my thinking: we will write bfree log messages for all the existing log blocks except the log block containing the bfree messages themselves (the last log block)
2009-05-04 11:33 this effectively frees all log blocks except the last one, which has to be retained until the next log flush cycle
2009-05-04 11:34 now, we are on where point in figure?
2009-05-04 11:34 (2a)?
2009-05-04 11:35 what is the event between (2) and (2a)?
2009-05-04 11:35 (2) is before flush_log(), (2a) is after flush_log()
2009-05-04 11:36 ****** is in flush_log()
2009-05-04 11:36 ah, and we should have a (3) to show the next delta flush
2009-05-04 11:36 *** is during flush_log()
2009-05-04 11:36 (1) and (1a) is not enough?
2009-05-04 11:36 it depends what we mean by "flush delta"
2009-05-04 11:37 ah
2009-05-04 11:37 to me, flush delta does not generate any disk activity itself
2009-05-04 11:37 flush delta means stage_delta() and flush_log()
2009-05-04 11:37 sorry
2009-05-04 11:37 to me, flush log does not generate any disk activity itself
2009-05-04 11:37 disk activity is only generated by flush delta
2009-05-04 11:38 flush log does something very simple
2009-05-04 11:38 btw, flush_log() is not flushing log
2009-05-04 11:38 you mean the code I wrote?
2009-05-04 11:38 yes
2009-05-04 11:38 ah, no
2009-05-04 11:38 I wrote
2009-05-04 11:38 ah
2009-05-04 11:38 ok, good
2009-05-04 11:38 however, there is no big change iirc
2009-05-04 11:39 flush_log() should do two things: write bfree records into the log and reset the sb->logbase (I think that is the variable name)
2009-05-04 11:39 btw, this is why I'm looking the new word of "flush" for flush_log()
2009-05-04 11:40 what are the two different meanings?
2009-05-04 11:40 it seems to me that there should be only one flush_log, and it should be part of delta transition
2009-05-04 11:41 flush is "write the dirty buffers to disk", and flush_log() is "flushing the btree and bitmap, then write log blocks in atomic commit"
2009-05-04 11:41 that is, sometimes part of the delta transition, if it is time to flush the log
2009-05-04 11:41 to me, these are exactly the same concept
2009-05-04 11:42 concept may be same, however the detail is confusable
2009-05-04 11:42 btw, now, I'm using to flushing the log blocks, write_log()
2009-05-04 11:43 but, it can also call flush_log()
2009-05-04 11:43 :)
2009-05-04 11:44 I want to word of "flush delta" in figure
2009-05-04 11:44 I'm thinking we are calling it "flush"
2009-05-04 11:45 flush_log() is good, until you convince me otherwise ;)
2009-05-04 11:45 :)
2009-05-04 11:46 e.g. now, I'm using flush_buffer_list()
2009-05-04 11:46 it flushes the buffers in list
2009-05-04 11:47 but, flush_log() is not meaning, to flush the log buffers
2009-05-04 11:47 it does "flush cycle" in atomic commit
2009-05-04 11:47 i.e. flushing the bitmap, btree, and log blocks
2009-05-04 11:48 e.g. if we call it fdelta like normal delta
2009-05-04 11:48 it helps me
2009-05-04 11:48 flush_log() will become flush_fdelta()
2009-05-04 11:48 it means to flush the fdelta cycle
2009-05-04 11:49 just use the name you like most
2009-05-04 11:49 i.e. I want the word like "delta" for flush cycle
2009-05-04 11:49 but, I want to share it with you
2009-05-04 11:50 yes, otherwise I won't understand it
2009-05-04 11:50 thanks
2009-05-04 11:50 anything that reduces confusion is good at this point
2009-05-04 11:51 yes
2009-05-04 11:51 so, for now, let's call flush cycle is fdelta
2009-05-04 11:52 well, back to the retire log blocks
2009-05-04 11:54 now, we are on (2a) point
2009-05-04 11:55 so, which log blocks can we free?
2009-05-04 11:56 I'm expecting to free before (1) at the between (2) and (2a)
2009-05-04 12:12 we can free every log block up to the block where the bfree log records begin
2009-05-04 12:13 bfree log records?
2009-05-04 12:13 ah
2009-05-04 12:14 new log record, now we introducing?
2009-05-04 12:37 not a new log record, just the log record for a deferred free
2009-05-04 12:38 -!- tim_dimm(~timothyhu@pool-71-107-51-9.lsanca.dsl-w.verizon.net) has joined #tux3
2009-05-04 12:38 freeing log blocks is a deferred free, just like other deferred frees
2009-05-04 12:38 um...
2009-05-04 12:39 ah, ok
2009-05-04 12:40 when do we log the deflush of log blocks?
2009-05-04 12:41 in the fdelta cycle?
2009-05-04 12:41 yes
2009-05-04 12:42 that is why I suggested adding a variable to track the start of the valid part of the log
2009-05-04 12:43 this is just an offset in the oldest log block
2009-05-04 12:44 replay ignores any log records before the "valid log offset"
2009-05-04 12:44 so log flush just needs to make the defree entries, then set the logbase and logvalid (suggested variable name)
2009-05-04 12:45 or maybe logoffset or maybe logbase_offset
2009-05-04 12:48 um...
2009-05-04 12:49 logvalid is like sb->next_logbase and sb->logbase in my patchset?
2009-05-04 12:51 flips, idea for tux3.org
2009-05-04 12:51 we should make files out of the tux3 U logs
2009-05-04 12:51 was explaining to someone about how we used to have tux3 U
2009-05-04 13:21 -!- RazvanM(~RazvanM@dazzler.isi.jhu.edu) has joined #tux3
2009-05-04 13:34 -!- dcg(~dcg@134.pool80-103-0.dynamic.orange.es) has joined #tux3
2009-05-04 13:48 hirofumi, just like that
2009-05-04 13:48 I see
2009-05-04 13:48 that is, there are two numbers to describe the start of the log
2009-05-04 13:49 a block number, and an offset within the block
2009-05-04 13:49 ah
2009-05-04 13:50 and merges to fdelta log and delta log?
2009-05-04 13:50 now, I'm writing it as separated blocks
2009-05-04 13:51 by using the log start offset, we avoid allocating a new log block
2009-05-04 13:51 I think that is the only effect
2009-05-04 13:51 but it is a nice effect
2009-05-04 13:52 the optimize things?
2009-05-04 13:52 exactly, it merges the fdelta and delta log
2009-05-04 13:52 i see
2009-05-04 13:52 that is the point, the fdelta log is not really separate
2009-05-04 13:53 at least, I think it is not really separate
2009-05-04 13:53 however, I think it makes retire log block complex a little
2009-05-04 13:54 because those log records are having different retire cycle
2009-05-04 13:55 before fdelta is retired at next fdelta
2009-05-04 13:55 but, fdelta is not
2009-05-04 13:55 um...
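[editor's note] The "two numbers to describe the start of the log" idea from 12:42 to 13:49 can be sketched roughly as below. This is only an illustration, not code from the tux3 tree or from the patchset under discussion: the struct names, the fixed record size, and replay_block() are hypothetical, and only logbase and logvalid are names taken from the chat. The sketch assumes fixed-size records so that the offset arithmetic stays trivial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sizes only; real log blocks and records differ. */
enum { LOG_BLOCK_SIZE = 64, LOG_RECORD_SIZE = 8 };

/* Start of the valid log: a block number plus an offset inside that block. */
struct log_start {
	uint64_t logbase;   /* oldest log block that is still part of the log */
	uint32_t logvalid;  /* byte offset of the first valid record in it */
};

/* Hypothetical in-memory copy of one log block. */
struct log_block {
	uint64_t blocknr;
	uint32_t used;                  /* bytes of record data in this block */
	uint8_t data[LOG_BLOCK_SIZE];
};

/*
 * Replay one log block and return how many records were processed.
 * Records before logvalid are ignored, but only in the oldest block;
 * every later block is replayed from offset 0.
 */
static unsigned replay_block(const struct log_block *blk,
			     const struct log_start *start,
			     void (*apply)(const uint8_t *rec, void *ctx),
			     void *ctx)
{
	uint32_t pos = 0;
	unsigned applied = 0;

	assert(blk->used <= LOG_BLOCK_SIZE);
	if (blk->blocknr == start->logbase)
		pos = start->logvalid;

	for (; pos + LOG_RECORD_SIZE <= blk->used; pos += LOG_RECORD_SIZE) {
		if (apply)
			apply(blk->data + pos, ctx);
		applied++;
	}
	return applied;
}
```

With logvalid set to 16 and 32 bytes used, the oldest block replays only its last two records, while any other block with 32 bytes used replays all four.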
2009-05-04 13:56 to make sure, the log records of fdelta means logs of bitmap/btree during flushing those
2009-05-04 14:32 sorry, time to sleep
2009-05-04 14:32 I'll think it more based on your bfree log of log block
2009-05-04 14:47 I didn't think it would make the log block retiring more complex
2009-05-04 14:47 but maybe I overlooked some issue
2009-05-04 14:48 ok, what happens at a flush cycle is, we generate some log records like bitmap bfrees and so on, then immediately discard them, because we are writing the actual bitmaps out in the same delta
2009-05-04 14:48 I may also be overlooking something, because I'm confusing more or less to handling this
2009-05-04 14:48 that is probably the confusing part
2009-05-04 14:49 yes
2009-05-04 14:49 ok, will the explanation is: we are generating some log records that we immediately throw away
2009-05-04 14:49 I've almost implemented the change_end() except this
2009-05-04 14:49 and we just do that to keep the code simple
2009-05-04 14:49 but this part is rewrited some times
2009-05-04 14:50 can I get your latest prototype and work on it with you?
2009-05-04 14:50 ok
2009-05-04 14:50 I'll push latest my patches
2009-05-04 14:50 post a link to the mailing list?
2009-05-04 14:50 ok
2009-05-04 14:50 good
2009-05-04 14:50 ok
2009-05-04 14:51 it will be a few hours before I can look at it
2009-05-04 14:53 http://userweb.kernel.org/~hirofumi/atomic.tar.gz
2009-05-04 14:53 this is my current patchset
2009-05-04 14:53 ah, I should post it to ml?
2009-05-04 14:54 ah, ok, I'll post it
2009-05-04 15:29 folks
2009-05-04 15:30 post to ml is good
2009-05-04 15:30 this is regarding atomic commits ?
2009-05-04 15:31 yes, I posted it
2009-05-04 15:31 bh, yes
2009-05-04 15:31 good :)
2009-05-04 15:31 tux3 development seemed to be stalled for a while, I was worried
2009-05-04 15:32 bh, never fear, hirofumi is hear ;-)
2009-05-04 15:32 well, I was working for it more or less, however, not in public repo
2009-05-04 15:32 here rather
2009-05-04 15:32 :)
2009-05-04 15:32 :)
2009-05-04 15:32 yeah, I was looking at the logs recently and noticed there wasn't too much activity
2009-05-04 15:33 glad to hear it's moving again publicly
2009-05-04 15:33 its been way down, but there's been plenty of activity behind the scenes
2009-05-04 15:33 me too
2009-05-04 15:33 glad you noticed
2009-05-04 15:33 call it spring break if you like
2009-05-04 15:33 well, my current patchset breaks current functional
2009-05-04 15:34 so, it is not into public
2009-05-04 15:34 i noticed a crudehack in there somewhere :)
2009-05-04 15:34 maybe, until working write and replay more or less
2009-05-04 15:37 btw, time to sleep
2009-05-04 15:37 night
2009-05-04 15:37 good night
2009-05-04 15:37 night hirofumi
2009-05-04 15:37 it's already morning though :)
2009-05-04 15:38 7:38
2009-05-04 15:41 you are Daniel's Japanese twin
2009-05-04 15:41 creativity fueled by lack of sleep :-)
2009-05-04 18:55 -!- edt(~Ed@254-78.162.dsl.aei.ca) has joined #tux3
2009-05-04 23:36 -!- RazvanM(~RazvanM@pool-173-67-57-242.bltmmd.east.verizon.net) has joined #tux3
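
[editor's note] The retire scheme proposed at 11:33 and 12:44 (write a bfree record for every log block except the newest one, append those records to the newest block, then set logbase and logvalid) might look roughly like the sketch below. It is a toy in-memory model, not the tux3 implementation: struct log_state, LOG_BFREE, retire_log_blocks() and the array sizes are hypothetical, and logvalid is kept as a record index here rather than a byte offset. It also does not model the point made at 11:33 that the surviving block must be retained until the next flush cycle, nor the question raised at 13:53 about records with different retire cycles.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_LOG_BLOCKS = 8, MAX_RECORDS = 16 };
enum { LOG_BFREE = 1 };                 /* hypothetical record type code */

struct log_record {
	uint8_t type;
	uint64_t block;                 /* block address the record refers to */
};

struct log_state {
	uint64_t blocks[MAX_LOG_BLOCKS];      /* log block addresses, oldest first */
	unsigned count;                       /* number of log blocks in use */
	struct log_record tail[MAX_RECORDS];  /* records of the newest log block */
	unsigned tail_used;
	uint64_t logbase;                     /* block where the valid log starts */
	unsigned logvalid;                    /* first valid record in that block */
};

/*
 * Retire every log block except the newest one by appending one bfree
 * record per retired block to the newest block. Returns the number of
 * blocks retired, or -1 if the records do not fit (a real implementation
 * would have to start a new log block in that case).
 */
static int retire_log_blocks(struct log_state *log)
{
	assert(log->count <= MAX_LOG_BLOCKS);
	if (log->count == 0)
		return -1;

	unsigned retired = log->count - 1;
	if (log->tail_used + retired > MAX_RECORDS)
		return -1;

	unsigned first_bfree = log->tail_used;
	for (unsigned i = 0; i < retired; i++)
		log->tail[log->tail_used++] =
			(struct log_record){ LOG_BFREE, log->blocks[i] };

	/* The log now starts where the bfree records begin. */
	log->blocks[0] = log->blocks[retired];
	log->count = 1;
	log->logbase = log->blocks[0];
	log->logvalid = first_bfree;
	return (int)retired;
}
```

Because the bfree records land in the block that is already current, no new log block has to be allocated just to record the frees, which is the effect described at 13:51.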